A two-dimensional volatility basis set: 1. organic-aerosol mixing thermodynamics

. We develop the thermodynamic underpinnings of a two-dimensional volatility basis set (2D-VBS) employing saturation mass concentration ( C o ) and the oxygen content (O:C) to describe volatility, mixing thermodynamics, and chemical evolution of organic aerosol. The work ad-dresses a simple question: “Can we reasonably constrain organic-aerosol composition in the atmosphere based on only two measurable organic properties, volatility and the extent of oxygenation?” This is an extension of our earlier one-dimensional approach employing volatility only ( C ∗ = γ C o , where γ is an activity coefﬁcient). Using available constraints on bulk organic-aerosol composition, we argue that one can reasonably predict the composition of organics (car-bon, oxygen and hydrogen numbers) given a location in the C o – O:C space. Further, we argue that we can constrain the activity coefﬁcients at various locations in this space based on the O:C of the organic aerosol.


Introduction
It is well established that atmospheric organic aerosol (OA) exists as a complex mixture of thousands of individual organic compounds (Hildemann et al., 1991;Fraser et al., 1997Fraser et al., , 1998Goldstein et al., 2008). Furthermore, it is clear that a large fraction of OA consists of organic compounds that are difficult to separate in a gas chromatograph (Schauer et al., 1996), and that very often a majority of OA consists of quite highly oxidized organic molecules (Zhang et al., 2007;Aiken et al., 2008;Ng et al., 2009).
As an example, the most abundant identified organic molecule during the Pittsburgh Air Quality Study, generally a carboxylic acid, typically comprised about 1% of the total or-Correspondence to: N. M. Donahue (nmd@andrew.cmu.edu) ganic carbon mass Shrivastava et al., 2007). A typical 200 nm diameter particle contains 5-10 million individual organic molecules with molar weights around 200 g mole −1 . The most abundant molecule appears approximately 50 000 times in that single particle, while hundreds to thousands of molecules appear of order 100-1000 times each in that same particle.
A critical determinant of whether an organic compound is found in OA (as opposed to the gas phase) must be saturation vapor pressure. Here we shall use saturation mass concentration (C o and C * , in µg m −3 ) because mass measurements pervade aerosol science, and we shall use the terms "volatility" and "saturation concentration" interchangeably. The difference between C o and C * is that C o is the saturation concentration of a vapor over a pure (sub-cooled) liquid, while C * includes the activity coefficient, γ , in a mixture; thus C * = γ C o . We have previously presented a framework describing OA mass -whether or not the constituents are identified -with a series of volatility "bins" that are separated by one decade in C * , which we shall refer to as the "one-dimensional volatility basis set" (1D-VBS) . Here we shall extend the 1D-VBS to a second dimension in order to improve our ability to predict the thermodynamics, including organic mixing and polarity, and ultimately to coherently describe oxidation chemistry. For this 2D-VBS we shall add the extent of oxygenation -specifically O:C -as the second dimension.
Because OA is a complex mixture, the thermodynamics of mixtures are important (Pankow, 1994). From both a modeling and a theoretical perspective, it is desirable to be able to predict volatility under ambient conditions. It is also important to know whether different organics will tend to form a single, homogeneous phase or whether they will phase separate. One reason is that phase separation can sharply reduce the predicted mass of the less abundant phase (Bowman and Melton, 2004), while another is that separate phases will Published by Copernicus Publications on behalf of the European Geosciences Union. 3304 N. M. Donahue et al.: A two-dimensional volatility basis set allow distinct OA types to remain externally mixed (for example with different sized modes) while a single OA phase will tend to drive OA composition toward a uniform, internal mixture (Marcolli et al., 2004;Asa-Awuku et al., 2009).
A common simplification applied to complex organicaerosol mixtures in the atmosphere is to represent the mixture with a number of surrogate molecules. With this approach, one strives to carefully select a small set of surrogate molecules and then to calculate the thermodynamics as accurately as possible. There are a number of challenges to this approach.
Surrogate-based calculations require knowledge of the aerosol mixture composition and the abundance of each of the molecular subgroups (Bowman and Melton, 2004), coupled to an accurate calculation of the vapor pressures of the surrogates and their activity coefficients in solution. This is difficult. The multifunctional and long-chain molecules typically found in organic aerosols may not be accurately described by the estimation methods (Koo et al., 2003;Bowman and Melton, 2004). Finally, even if the thermodynamics of the surrogate mixture can be calculated exactly, it is not at all clear that this surrogate mixture will accurately represent the behavior of real OA mixtures. For instance, Clegg et al. (2008b) have shown that even a small set of molecules represented by a single surrogate in typical schemes can have a very wide range of individual vapor pressures.
An advantage of the surrogate-based methods is that inhomogeneous effects from extremely important constituents such as water and inorganic ions can be treated explicitly (Clegg et al., 2003). In this work we will focus on the thermodynamics of complex organic mixtures, neglecting for now these important effects. However, to the extent that hygroscopicity correlates with our second dimension , it may be possible in the future to parameterize the combined influence of water and inorganic ions on the thermodynamics.
Rather than predicting the phase partitioning of specific molecules (e.g. levoglucosan, etc.) or surrogates, we seek to infer the OA composition (the carbon, hydrogen, and oxygen numbers, n C , n O , n H ) of fractions of the OA constituents identified by their location in our 2-D volatility -O:C space. We also want to be able to understand the physical properties of the OA, including hygroscopicity as well as the potential for different classes of OA to phase separate, and we want to inform chemical mechanisms describing how these properties (including volatility and O:C) evolve during photochemical oxidation.
We thus develop an approach relying principally on data from in situ mass spectrometers, which measure a large fraction of the OA mass and are able to constrain the bulk atomic composition with good precision (Aiken et al., 2008). The most precisely constrained measure is the oxygen to carbon ratio (O:C). Volatility is more difficult to constrain, but several studies have shown recently that a combination of dilution and thermodenuder measurements can be inverted to constrain the volatility distribution of OA mixtures using our 1D-VBS for both chamber data Shilling et al., 2009;Kostenidou et al., 2009) and ambient data (Cappa and Jimenez, 2010).
Here we shall show that we can develop a self-consistent thermodynamic framework for OA mixtures based on location in our 2-D space. Our central assumption is that we can calculate the solvation energy of an organic constituent as a linear combination of interaction energies between different functional groups in the molecule and different functional groups in a solvent, and that in a rich, amorphous system, these interactions will be homogeneous and isotropic. In other words we will model OA as an amorphous collection of functional groups, defined by the mean properties of the OA and without any other regard to the molecular composition of the OA. This is motivated by the example of composition presented above -individual molecules in an OA particle are rarely if ever solvated by themselves but rather by a complex mixture of many molecules containing several different functional groups in many combinations. While the individual solvent molecules can not be completely enumerated, the necessary average properties can now be measured routinely.
Our hypothesis is that the great complexity of OA mixtures actually simplifies the problem by reducing the importance of any individual, unique molecular behaviors or properties in favor of the bulk average properties. We argue that we can reasonably describe the behavior of OA mixtures using only four parameters: a compound with a reference volatility (we use pentacosane), and three interaction terms expressing in turn carbon-carbon interactions (non-polar interactions), oxygen-oxygen interactions (polar interactions), and the nonideality of the carbon-oxygen interaction. We shall explore the implications of this, both in the laboratory, where different organics may or may not serve as good "seeds" for other organics (Song et al., 2007;Asa-Awuku et al., 2009;Vaden et al., 2010), and in the atmosphere.

Data and mean trends
We would like to be able to describe log 10 C * in terms of observable average molecular properties of OA -specifically the average carbon, hydrogen, and oxygen numbers (n C ,n H ,n O ). Initially we will focus on the pure-compound saturation concentrations (log 10 C o ). Accurate prediction of pure-compound vapor pressures requires structural information about the compounds, and accurate prediction of activity coefficients in mixtures requires an accurate description of the solvent. For atmospheric organic aerosols we simply do not yet have that luxury, so we must turn to simpler approaches.
There are a number of vapor-pressure estimation methods in the literature, with varying degrees of complexity. Rarey and co-workers (Nannoolal et al., 2004) argue convincingly that accurate empirical estimation should combine predictions of the normal boiling point of a compound with the enthalpy of vaporization. Because we have quite limited knowledge of the structure of organic-aerosol constituents, a convenient method is SIMPOL (Pankow and Asher, 2008), in which the sub-cooled liquid vapor pressure of a pure substance can be described as a linear combination of contributions from various functional groups: where b k is the effect of the k th "structural element" and ν k,i is the number of those elements in the molecule i. We shall build a group-contribution method very much like SIMPOL, replacing log 10 p 0 L,i with log 10 C o and focusing on the carbon and oxygen numbers (n C and n O ) of the organics. This extremely simple group-contribution method is justified in part by our limited knowledge of actual organic-aerosol composition, and in part because the information we do have suggests roughly constant relative fraction of various oxygenated functional groups, as we shall discuss below. Our objective in this paper is to describe log 10 C o = f (n C ,n O ), neglecting for the time being other contributions (for example, N, S) in favor of the oxygenated organic compounds that appear to comprise the bulk of OA.
Activity coefficients are a greater challenge. The universal functional activity coefficient (UNIFAC) method (Fredenslund et al., 1975) has been widely used to describe the activity of organic aerosol mixtures (Koo et al., 2003;Bowman and Melton, 2004;Pun et al., 2006). UNIFAC is a group-contribution method that gives the excess Gibbs free energy of each component in a specified mixture based on the non-ideality arising from the size and shapes of the constituent molecules and the non-ideality arising from their energetic interactions (Asher et al., 2002;Chandramouli et al., 2003). In keeping with the simplified approach we adopt for pure-component vapor pressures, we shall adopt a highly simplified description of activity coefficients as well.
Our first interest is the pure-component saturation concentrations. We plot data for a very wide range of compounds in Fig. 1, showing log 10 C o vs. n C . The data are from NIST as well as the literature (Koponen et al., 2007;Cappa et al., 2007). The oxygenated compounds all have functionality identified in ambient OA (-OH, =O, -C(O)OH, etc). We group molecules in classes defined by functionality (i.e. mono-alcohols, with a single -OH), and plot each class with a different symbol. The broad, colored bands in the background are volatility classes we use elsewhere ); here they serve to guide the eye through the various figures. The important attribute for atmospheric partitioning is that compounds in the light-green range (0 ≤ log 10 C o ≤ 2) will be semi-volatile in the atmosphere, existing in significant fractions in both the condensed and vapor phases. Compounds in the light red (−3 ≤ log 10 C o ≤ −1) Carbon Number and gray (log 10 C o ≤ −4) ranges will be almost completely condensed under typical ambient conditions. Note that there are relatively few data in the semi-volatile range and almost none below that. This is because a saturation concentration of 1 µg m −3 corresponds to approximately 10 −7 torr (1.3 × 10 −5 Pa); this is an exceedingly low value that is difficult to measure.
There are several shortcomings with the available data. The data in Fig. 1 become noisy approaching the semivolatile range and they do not include richly multifunctional compounds such as those that must comprise a large portion of the OOA pool. Some of the noise at a low C o may be related to uncertainty in converting to sub-cooled liquid values, and the rest may be due to measurement uncertainties associated with very low vapor pressures.
The diacid series does reach these low values. We show data from Koponen et al. (2007) and Cappa et al. (2007), adjusted to sub-cooled liquid vapor pressures. The atmospheric community has devoted considerable effort to measuring their vapor pressures (Tao and McMurry, 1989;Bilde and Pandis, 2001;Bilde et al., 2003;Koponen et al., 2007;Cappa et al., 2007Cappa et al., , 2008Booth et al., 2010;Soonsin et al., 2010). In spite of this effort the data differ by an order of magnitude in many cases -the diacid values are uncertain by a least half a decade, consistent with the scatter shown by the magenta squares in Fig. 1. The gross carbon-number trend is, however, still evident. The carbon-number trend is the most obvious feature of Fig. 1, with different classes forming parallel lines. This means that there is a well-defined effect of increasing n C , independent of molecule class. We can thus define a robust relationship between n C and C o : each class of molecule has a log 10 C o vs. n C slope of −0.475 decades per carbon, and we shall assume that this holds in general.
We also need a reference point, which we will define as the carbon number n C at C o = 1 µg m −3 for pure hydrocarbons. From Fig. 1 this is pentacosane (C 25 ), indicated with the black square containing a "C" marking a heavy, dashed black line. This reference n C may be uncertain by 2 carbons (corresponding to a 10% uncertainty in the group contribution for carbon to C o ). Some data for high-n C alkanes lie above the trend line, while others fall on it. Given the precision of the relationship for lower-n C alkanes, we believe this indicates the difficulty in measuring such low vapor pressures, and that the lower envelope represented by the trend line represents the true values for the n-alkanes.
The next step in developing our simple group-contribution method is to define the influence of oxygenation. A challenge is that we need to account for the influence of different oxygenated functional groups. Increasing functionalization decreases volatility systematically, with logarithmic factors given by the vertical offsets of the various trend lines in Fig. 1 from the pure hydrocarbon trend line. For example, each carbonyl group (=O, blue stars and yellow diamonds) decreases C o by 1 decade, and each alcohol group (-OH, green asterisks) decreases C o by 2.3 decades. One organic-acid group decreases C o by about 3.5 decades, or 1.75 decades per oxygen (the carbon is not counted because it is included in n C ), which is very close to the sum of (-OH) and (=O).
These effects are shown in a different way in Fig. 2, which casts the effects of functionalization in a 2-D space with log 10 C o on the x axis and O:C on the y axis. The different functional groups generate different slopes (shown with dashed red lines). The wedges in Fig. 2 describe the range of values one might expect as a carbon backbone is progressively functionalized. These wedges thus represent the effect of oxidation chemistry on the volatility and O:C of a reduced precursor, under the limiting condition that oxidation only functionalizes the backbone (i.e. n C remains unchanged).
Despite the uncertainties in the underlying data, there are enough data to suggest that these single-substituent effects on C o , described by the trend lines in Figs. 1 and 2, are not grossly erroneous. In the VBS we group compounds in bins separated by an order of magnitude in C o , and overall we estimate the uncertainty in extrapolation in these simple groupcontribution calculations to be somewhere between a factor of 10 and 100, meaning ±1 bin to either side of the estimated volatility bin for a known functionality.
However, if we are to construct a two-dimensional thermodynamic framework for bulk OA, we need to describe an average effect of added oxygen. Multiple lines of evidence suggest that (-OH), (=O) and (-C(O)OH) functionalities are ma- jor contributors to ambient oxygenated OA, known as OOA (Zhang et al., 2007). First, OOA is associated with large signals in the AMS at m/z=18 (H 2 O + ) and 44 (CO + 2 ), consistent with dehydration and decarboxylation of organic acids (Aiken et al., 2008). Second, NMR analyses of solventextracted filter samples suggest high abundances of these functionalities Moretti et al., 2008). Third, FTIR analyses of ambient filter samples show significant contributions of (-OH), (=O) and (-C(O)OH) functional groups (Hawkins et al., 2010). Finally, a van Krevlen analysis of ambient OA by Heald et al. (2010) is consistent with either hydroxycarbonyl or organic-acid functionalization dominating organic-aerosol oxidation on average (i.e. an average of one hydrogen is lost for every oxygen added to ambient OA).
Based on this evidence, we shall assume that, on average, OA consists of multifunctional organics containing organicacid terminal groups and an equal mixture of alcohols and ketones along the backbone. We can thus define the average effect of each added oxygen as decreasing C o by 1.7 decades, shown in Fig. 1 by thick, dashed lines (labeled by numbered squares at n C = 25) and in Fig. 2 with black lines. This is consistent with either acid or hydroxycarbonyl functionality, provided that these simple group-contribution values can be extrapolated to multifunctional compounds that presumably constitute OOA. What results is a simple, three-parameter group-contribution expression: with n 0 C = 25, b C = 0.475 and b O = 1.7. We have neglected other functionalities, but they fall within the range already defined. Hydroperoxides (-OOH) are assumed to be important under low-NO x conditions (Logan et al., 1981;Donahue and Prinn, 1993), and we see evidence for their formation in 2D-NMR samples of terpene ozonolysis (Maksymiuk et al., 2009). The effect of -OOH on C o is similar to -OH (Pankow and Asher, 2008). Ethers (-O-) may also be significant in ambient OA. They have a modest effect on volatility, slightly less than =O, of about 0.7 orders of magnitude per substituent (Pankow and Asher, 2008). Ethers may form through ring-closure reactions during the oxidation of larger organic compounds (Lim and Ziemann, 2005) or as part of accretion reactions involving small monomers such as glyoxal (Kroll et al., 2005;Carlton et al., 2006;Volkamer et al., 2007;Galloway et al., 2009). Again, their effect is not far outside of the =O line (even counterbalancing the -OOH effect).
We have also omitted organic nitrates. They are certainly important in high-NO x environments (Lim and Ziemann, 2005;Presto and Donahue, 2006;Zhang et al., 2006;Farmer et al., 2010). A nitrate group (-ONO 2 ) typically reduces vapor pressure by about 2.5 orders of magnitude (Pankow and Asher, 2008), so we could add b N = 2.5 to Eq. (2). We neglect them here only for clarity -nitrates and any other special functionality (organo sulfates, amines, alkenes, etc.) will need to be treated separately if they are to be modeled.
The diacids reveal the major shortcoming of this simple, linear expression. While there is significant uncertainty in their volatility, the diacids (solid magenta squares) consistently fall about 1 order of magnitude below the 4-oxygen (dashed green, labeled "O 4 ") trend line in Fig. 1, which should be very close to the value for two acid groups. This discrepancy will provide an important constraint as we relax the ideal, linear assumptions we have used so far.

Non-linear effects
The lower than predicted diacid volatilities suggest a need for some modification of Eq. (2). In addition, as we shall show in our discussion of non-ideal solution behavior below, a non-linear term is required to describe non-ideal solution behavior. For reasons that will become evident, we shall introduce non-linearity with the following expression: As we shall see the cross term is sufficient to provide interesting and realistic behavior -and we can constrain it. This non-linear group-contribution expression contains four free parameters: Carbon Number These four parameters underpin a reasonably accurate (and testable) thermodynamic framework.
Consistent with our earlier discussion, n 0 C = 25 and b C = 0.475. However, we shall use b O = 2.3 and b CO = −0.3. This is because the limit of Eq. (3) at low O:C contains n O (b O + 2b CO ) = 1.7n O and the data we are using to constrain these terms are heavily biased towards low O:C. We increased the overall effect of oxygenation to match the low volatility of diacids, even though the linear rules do a good job predicting mono-substituted organic acids. In addition, many of the sequences of oxygenated organics show a "hook" at low n C , with lower volatility in the data than predicted by the simple linear trend lines in Fig. 1 (for example, the alcohols and mono-acids for n C < 4). In Fig. 3 we show these data again, but now employing the non-linear group-contribution method described by Eq. (3). The new expression, again summarized with dashed lines, gives better agreement with the data. Specifically, the group-contribution curves (the thick dashed curves) show a low-n C hook, and the volatility of the diacids is now reproduced with good fidelity. The literature is in reasonable agreement for diacid vapor pressures at low carbon number, where our new expression agrees with the data well, while there is significant disagreement at larger n C . Here the data we have plotted are at the low end of the literature values, but our function lies closer to the middle of those values (Koponen et al., 2007). As we shall discuss below, the non-linear term is vital to producing a self-consistent (simple) theory that reproduces volatility trends but also allows mixing thermodynamics to exhibit nonideality (specifically phase separation of polar and non-polar organics). However, the qualitative behavior of those mixtures is not very sensitive to the exact value of the non-linear term. For now, however, Eq. (3) simply improves our empirical fit without too much complexity.

Defining the 2-D volatility space
The group-contribution method described by Eq. (2) allows us to determine the mean properties (specifically n C ) in the 2-D space presented in Fig. 2. The mean-property lines of 1.7 decades per oxygen are shown in black. The C 20 lines are much shallower, as O:C increases much more slowly with each added O for larger molecules. It is very important to realize that we are not using this relation to predict log 10 C o for a given n C and O:C but rather the reverse: in both laboratory experiments and field measurements we typically have a much better constraint on O:C and log 10 C o than n C , and so we are using a simple group-contribution method to infer n C (and the average molecular formula) from those observables. Likewise, in the VBS modeling framework , we use log 10 C * explicitly and are thus not directly sensitive to uncertainties in volatility.
However, we can construct a family of contours (isopleths) for constant values of n C , and n O , with some uncertainty and spread in the log 10 C o , O:C space. This is shown in Fig. 4. We can also plot n H but have omitted it for clarity -the assumption here is that H:C = 2:1 at O:C = 0 (along the bottom axis) while H:C = 1:1 at O:C = 1:1 (along the top axis). This is consistent with the van Krevelen relation discussed above (Heald et al., 2010), and it also means that we have assumed a linear relationship between O:C and the mean oxidation state of carbon, OS C , presented recently by Kroll et al. (2011) as a more fundamental indicator of oxidation in organic aerosol.
The contours in Fig. 4 interconnect descriptive, diagnostic, and prognostic aspects of the 2-D space. This paper addresses the descriptive aspect, focusing on the relationships between log 10 C o and O:C on the one hand and n C , and n O on the other. In subsequent publications we will explore the diagnostic and prognostic aspects. The diagnostic aspect allows us to evaluate the evolution of n C , n H , and n O in observational (laboratory and field) datasets where log 10 C * , O:C, and total organic mass are constrained, using the relationships established here. The prognostic aspect allows us to predict the evolution of log 10 C * , O:C, and total organic mass in box and transport models, with individual reaction steps in part constrained by the relations developed here.
The isopleths in Fig. 4 define the average composition in this 2-D space. The nonlinearity of Eq. (3) causes slightly more than a 1-decade leftward shift in the isopleths at O:C = 1:1. For example, in Fig. 2 the C 6 isopleth intersects O:C = 1:1 just below log 10 C o = −1 while in Fig. 4 it is nearly log 10 C o = −3. The broad region of our 2-D space with 10 −5 < C o < 100 µg m −3 and 1:3 < O:C < 1:1 comprises compounds with 5-12 carbons and 4-8 oxygens. Even though this is also the most uncertain region, based largely on extrapolation, the range in carbon and oxygen numbers in this region is modest. There are also some data points in this region. For example levoglucosan is an anhydrous sugar (C 6 H 10 O 5 ) frequently used as a tracer for wood burning (Simoneit et al., 2004). Knudsen-cell data from a higher temperature liquid melt (Oja and Suuberg, 1999) extrapolate to 0 < log 10 C o < 1 , which is close to the C 6 O 5 intersection in Fig. 4.
This same broad area is roughly the region thought to be occupied by oxygenated organic aerosol (OOA), as indicated by the dashed oval. The O:C for OOA is well constrained by high-resolution AMS measurements , while the volatility range can be estimated with some uncertainty based on thermodenuder measurements (Cappa and Jimenez, 2010). However, it is clear that a great deal of compounds comprising OOA have n C = 6-10 and n O = 4-8.
The n C and n O isopleths in the OOA region in Fig. 4 are extrapolations. However, we consider this to be a reasonable projection of typical values for ambient organic-aerosol compounds, for all of the reasons discussed so far. Again, we emphasize that the primary quantities -log 10 C * and O:C, can be constrained by ambient observations, and our primary application of the non-linear group-contribution expression given by Eq. (3) is as a diagnostic to ascertain the typical values for n C , n O (and n H ).

Solution thermodynamics
So far our discussion has focused only on pure compounds, but organic aerosol is a very complex mixture. Consequently, it is important to connect the single-component volatility (C o i ) to the volatility in solution (C * i,s ) we use empirically in our 1D-VBS  by estimating the activity coefficients in that solution, γ i,s . The activity coefficient is related to the excess free energy of a solute in a solvent, and so we need to consider the interaction energies of different constituents in a complex OA mixture. It is important to estimate γ i,s because it will influence the fraction of semivolatile material in the condensed phase and thus organicaerosol levels (Bowman and Melton, 2004;Bian and Bowman, 2005;Song et al., 2007;Vaden et al., 2010), but perhaps more importantly because it will control when separate condensed phases can form and thus govern the stability of different organic-aerosol "modes" such as traffic emissions There is considerable evidence that the organics comprising OA are not always a homogeneous mixture (Maria et al., 2004;McFiggans et al., 2005). This may be in part because they simply have not had time to equilibrate via gas-phase exchange (Marcolli et al., 2004), perhaps because of slow condensed-phase diffusion and mass transfer (Zobrist et al., 2008;Virtanen et al., 2010). However, it could quite plausibly be because certain OA systems phase separate into a hydrophobic "organic" phase and a hydrophilic "aqueous" phase, as has been assumed in thermodynamic models based on surrogates (Griffin et al., 2003;Chang and Pankow, 2006;Lu and Bowman, 2010;Zuend et al., 2010).
In this work we shall use θ i ≡ E i /R to express energy as an equivalent temperature. Furthermore, we shall exploit the tight relationship between the enthalpy of vaporization of organics typical of ambient OA and their saturation concentration (Epstein et al., 2010). In the Appendix we present a highly simplified treatment connecting these empirical relations to the interaction energies of an organic solution. In general, what we seek is the interaction energy of a molecule i with a solvent s, θ i,s . For pure components, as discussed so far, the term is simply the interaction of i with itself, θ i,i , and at this level of approximation an ideal solution is one in which θ ideal i,s = θ i,i . Finally, the interaction energies (temperatures) θ and the group-contribution terms b developed above are related via Eq. (4): where θ 10 is the energy change for a 1-decade change in vapor pressure. By reducing the empirical vapor pressures of organics to a two-dimensional expression in Eq. (3) and Fig. 4, we have effectively reduced the atmospheric OA to a two-component system where the "components" are carbon and oxygen "functional groups" rather than individual molecules. The average composition of the OA solvent is given by the carbon and oxygen fractions, f s C and f s O = 1 − f s C . The situation is depicted in Fig. 5.
An ansatz here is that complex mixtures in atmospheric OA are sensitive to mean solvent properties, and that we can indeed describe this average behavior usefully. We assume that it is the functionalities of the solvent and not the specific molecules of the solvent that dominate the solute-solvent interactions (as in UNIFAC). For any individual molecule the values we predict will not be accurate (unlike UNIFAC), but we hypothesize that on average they will be (with some semiempirical calibration). However, we are also claiming that predictions based on a few surrogate molecules are almost certain to be inaccurate for complex mixtures, because the unique (and therefore unrepresentative) properties of those individual surrogate molecules will likely not represent the average (bulk) properties of the mixture very well.
For a given OA component i in the bulk OA solvent s we can now find the interaction energy: In both of these expressions the cross term θ CO is prominent. However, through the relation in Eq. (4) it is evident that the linear group-contribution expression given by Eq.
(2) is equivalent to: In order to make the cross term vanish in the pure component expression Eq. (6), we find: This exactly mirrors Eq. (A5) in the Appendix, to the point that we use the same symbol θ because these terms will turn out to be identical. For both a linear group-contribution expression and an ideal solution, the cross interaction term is simply the average of the diagonal terms. This is fundamental. There is a one-to-one correspondence between a linear group contribution to volatility and ideal solution behavior. This requires that the ideal solution is defined consistentlynot in terms of the molarity of the solute but in terms of the molarity of the functionalities comprising the solute and the mean-field properties of the solvent.
The most appropriate fraction f i M is the volume fraction of the molecule i in the OA solvent s. However, we are emphasizing relationships that can be measured, especially in ambient samples, and for OA that is the molar ratio O:C (Aiken et al., 2008). Furthermore, as shown above in Eq. (5), the group-contribution expressions in Eq. (2) and (3) can be written in terms of the molar ratio O:C. Because the carbon and oxygen moieties have similar masses, this means that a linear group-contribution expression such as Eq. (2) corresponds to ideal solution behavior on a mass basis. Formulating the thermodynamic interactions with different fractions (i.e., mole, volume, mass, etc.) is in essence a transformation of the x-axis in Fig. A2 in the Appendix. We are making no effort to exactly describe activity curves in that figure for individual molecules but rather seeking to develop reasonably accurate values for bulk OA and various measurable fractions, all with unknown molecular composition. Given roughly constant density, switching between volume, mass, and group molar fractions amounts to a slight skewing of the x-axis in Fig. A2 that will fall out with empirical calibration along with the other simplifying assumptions made so far. This is consistent with the mass-fraction based solution formulation in the 1D-VBS .
If a linear group-contribution expression is equivalent to Eq. (7), a non-linear expression must be given by θ CO = θ.
We can write the solvation energy in concise form as a matrix equation, adding a non-linear cross term δθ CO to the linear term introduced in Eq. (8): or, for a single-component system, the pure, sub-cooled liquid solvation energy is given by: This is functionally identical to the non-linear groupcontribution expression presented in Eq. (3), which is why we chose that form.
Using θ 10 = 690 K/decade from the Appendix, we can write the two-dimensional interaction matrix as: As we discussed above, the constraints on volatility are stronger for low O:C, so for empirical constraints it is useful to write the non-linearity as a deviation on the θ O,O term. Our best estimate for this non-linearity is δθ CO −207 K (b CO = −0.3).

OA activity coefficients
We are now in a position to estimate the activity coefficient γ i,s for mixtures; as C * i,s = γ i,s C o i , this is a mass-based activity coefficient. In the Appendix, we show that for simple two-component systems the excess free energy can be approximately related to the difference between the interaction energy between two constituents and the average of their individual interaction energies, δθ i,s , as given by Eq. (A12) (with "i" replacing "A").
Our objective is to estimate the activity coefficient of semivolatile organics at some location in the 2-D space shown in Fig. 4, given some bulk composition of the organic aerosol acting as a solvent. As we showed in Eq. (A9) of the appendix, this depends on the non-ideal off-diagonal interaction term. This is readily determined in our homogeneous solvent. It depends on δθ CO : which, by substitution in Eq. (A8) and (4) gives At this level of approximation, the non-ideality is driven by differences between the carbon fraction (f C , or O:C) in the solvent and solute, but the activity coefficient depends exponentially on the size of the solute (n M = n C + n O ). This is a substantial simplification; however, it amounts to asserting that polarity is the most important determinant of activity coefficients in organic solutions, which is justified (Prausnitz et al., 1998;Pankow and Barsanti, 2009). At 300 K, it does not depend on the uncertain value θ 10 , but rather the empirically derived b CO . With the non-ideality known, we can then calculate the activity coefficient of the particular solute in the 2-D space using Eq. (A8). The activity coefficients that fall out of this simplified analysis are all destabilizing (larger than 1), in part because the single parameter we have (b CO ) can only work in one direction. While this is consistent with the general tendency of non-polar and polar organic solutions to phase separate, it does not account for stabilizing interactions observed in some cases. Again, we must emphasize that this is not meant to be predictive for individual molecules, but rather to represent the bulk average behavior of complex organic mixtures with compositions (specifically = O and -OH ratios) found in the atmosphere.

Activity coefficients in the 2D-VBS
The activity coefficients in Eq. (13) are designed for use in our 2D-VBS. To illustrate the consequences of non-ideality, we shall consider two fairly typical situations typified by OA with two different bulk (single-phase) compositions. First we shall consider a highly aged background aerosol with O:C = 0.75:1 (f s C = 0.57). This is consistent with highly oxidized OA characteristic of the remote atmosphere and generally called OOA (oxidized organic aerosol) Ng et al., 2009) and roughly the center of the OOA region identified in Fig. 4. Second, we shall consider a much less oxidized aerosol with O:C = 0.25:1 (f s C = 0.80). This is more consistent with very fresh emissions, possibly combining traffic emissions (known as HOA, or hydrocarbon like organic aerosol to the AMS community) and either some biomass burning OA (BBOA) or slightly oxidized firstgeneration SOA (Lanz et al., 2007;.
In Fig. 6 we show organic activity coefficients γ i,s for the full range of organic solutes dissolved in these two different bulk aerosol solvents with b CO = −0.3. The O:C of the bulk solvent is indicated on each panel. The activity coefficients of organics are contoured in these panels for all C o , O:C combinations, with contour intervals of 0.5 ranging from 1 to 10, labeled at integer values from 1 to 5.
As  13) is 0.6 · 24 · 0.0693 1, and thus γ = 10. Note that for any given solvent -solute pair, identified by f s C and f i C , the activity coefficient will depend exponentially on n M and thus increase dramatically going from right to left in the 2-D space. Because of the importance of n M the activity coefficient will also increase from top to bottom (though the f C of course varies as well). Consequently, the region to the lower left, with non-polar, large, solute molecules, is the "phase separation region".
In general, when γ i,s > 5 or 10 it is likely that a constituent will phase separate, as the activity a i will be greater than 1.0 when f i > 0.2 or 0.1. If there is any significant amount of a group of similar constituents, they will form their own phase. Once a class is less than 10 or 20% of the OA, it is of secondary concern in any event. In AMS jargon, this implies that fresh HOA will not mix with aged OOA when OOA is in the majority (this is the "phase separation region" indicated along the lower edge of Fig. 6a). Note again that this fraction is the volume fraction, approximately the mass fraction, and we are judging ideality and modeling mixing on that basis.
What Fig. 6a shows is that when the background aerosol is highly oxygenated, non-polar reduced primary emissions will have a hard time mixing into that background aerosol and will instead tend to phase separate. More accurately, the POA will tend to remain in their original phase, often in their own size mode -for example a traffic mode. The phase boundary will be somewhere near the γ = 5 or 10 contours, depending on the exact composition distribution; compounds in the uncontoured region to the lower left of that boundary (low log 10 C o , low O:C) will form a separate, "POA" phase. The transition is quite sharp and its location is not very sensitive to the exact value of the off diagonal term δθ CO . This is very consistent with much more sophisticated calculations for far simpler mixtures designed to simulate OA. For example, Jang et al. (1997) showed that larger alkanes (exclusively) showed substantial activity coefficients in model mixtures designed to simulate both woodsmoke and SOA. A typical POA constituent is indicated in Fig. 6a with a gray dot. Figure 6b shows a very different picture when POA (HOA or BBOA) is the solvent. When the bulk OA is not very oxidized almost all OA compounds can mix with the aerosol with only modest activity penalties (γ < 3). Only for semivolatile compounds just at the volatility threshold (C * i C OA ) will the small difference in volatility due to the modest non-ideality have any consequence. These select compounds will shift toward the vapor phase because C * > C o for γ > 1. For the most part, though, when the bulk OA is relatively unoxidized (low O:C), we expect it to be characterized by a single condensed phase, which will likely become progressively more oxidized as the mixture ages . In AMS jargon, OOA will mix with HOA (or especially BBOA) when HOA is in the majority. The reason for the asymmetry is that n M is much smaller in the OOA region than in the POA region, and so activity coefficients are much lower for OOA in a POA solvent than for POA in an OOA solvent.
In general, the consequence of incorporating activity coefficients (larger than 1) into any representation of OA will be to force some compounds into the vapor phase (Bowman and Karamalegos, 2002). This may become even more pronounced considering the dynamics of fresh mobile-source emissions, which can be small enough for the Kelvin term to increase the volatility even more (Zhang et al., 2004). This will be most dramatic when the systems phase separate, and the region where a separate phase may have a relatively small mass fraction is the lower left-hand portion of the 2-D space, which comprises fresh primary emissions (POA, or HOA in AMS jargon).
However, it would be a gross oversimplification to conclude that incorporation of activity coefficients into a model coupling aerosol dynamics and chemistry would result in lowered OA levels due to the increased volatility. The consequence of forcing more HOA into the gas phase (as in Fig. 6a) will almost certainly be rapid gas-phase oxidation (Shrivastava et al., 2008), generating products that will mix with the OOA phase with mass yields that may even exceed unity (Presto et al., 2009(Presto et al., , 2010. The issue is thus not whether POA can act as a seed for SOA, but rather whether SOA (really OOA) is a good seed for POA. The answer is no, but the full answer is not that there will be less OA, but rather that POA may for a time exist in a separate, fresh size mode but that it will tend to evaporate, be more rapidly oxidized in the gas phase, and generate oxidation products that will mix into SOA. Thus the major observable consequence of non-ideality may be kinetic -the transformation from POA to SOA, or HOA to OOA, will be more rapid that it would otherwise be were an ideal solution assumed.

Conclusions
We have presented a self-consistent thermodynamic foundation for a two-dimensional volatility basis set relying on a minimum number of free parameters to describe realistic and measurable properties of real-world organic-aerosol mixtures. This treatment suggests that we can reasonably describe OA thermodynamics based on average OA properties including a non-ideality term (δθ CO ). This simultaneously and self-consistently describes non-linearity in a group-contribution expression to predict saturation concentrations as well as non-ideality in the solution thermodynamics. Because δθ CO < 0, the consequences of all this will be that more molecules will find themselves in the gas phase than in an ideal world.
The formulation so far does not address the role of water (Griffin et al., 2003;Pankow and Chang, 2008;Tong et al., 2008;Barley et al., 2009;Prisle et al., 2010), or other inorganics (Clegg et al., 2008a;Surratt et al., 2008;Wang et al., 2010), but the strong correlation of O:C and OA hygroscopicity reported in  suggests that we may be able to treat water in a similar fashion. However, because water can be a dominant component, and therefore its unique properties are more likely to challenge the mean-field assumption, it is likely that explicit treatment of humidity effects will be required (Prisle et al., 2010). Any model treating organic composition in ambient aerosol is almost certain to treat inorganic constituents as well, so there is some hope that the effect of these constituents on organic partitioning may be treatable within this 2-D framework. This would require that those effects be correlated strongly with location in this space, and a test of that conjecture is beyond the scope of this treatment.
The 2D-VBS complements our well-established 1D-VBS  by adding oxygen content (O:C) as a second dimension in addition to saturation concentration (C o ). Using pairwise interaction energies we can formulate a simple expression relating carbon and oxygen numbers to O:C and C o . The formulation includes a non-linear coupling term that can be constrained by pure component vapor-pressure data that also leads to qualitatively reasonable activity coefficient predictions. This formulation relies on measurable bulk properties of organic aerosol (most notably O:C) and uses a mean-field approximation for the moleculesolvent interactions -in essence we assume that a molecule interacts more with other functional groups in the organic aerosol solution than with other molecules in that mixture. Importantly, the complex behaviors develop from a very simple formulation containing only four free parameters, all of which are constrained by vapor-pressure data of pure substances.
This is a foundation for future work. In several succeeding publications we shall explore the uses of this 2D-VBS for both diagnostic analyses and prognostic calculations of organic aerosol properties, ultimately extending the model to a fully coupled treatment of size-dependent organic-aerosol dynamics.

A1 Saturation concentrations and vaporization enthalpies
Volatility is controlled principally by the vaporization enthalpy of a substance, especially over liquids and other disordered condensed phases where the vaporization entropy is nearly constant (Prausnitz et al., 1998). We recently showed that this holds for a wide range of organics. Specifically, a pseudo-Arrhenius expression can describe the temperature dependence of C o , and there is a nearly linear relationship between log 10 C o 300 and the vaporization enthalpy (Epstein et al., 2010 we shall scale energies by the gas constant and report them in Kelvins, θ = E/R. Thus the vaporization enthalpy becomes θ vap . A key value is the change in θ driving a one decade change in C o at 300 K: θ 10 = d θ vap /d log 10 C o . We also need to know the vaporization enthalpy for C o = 1 µg m −3 : θ vap 1 µg . So, for a compound i: This means that we need to determine or specify only one of θ vap i or log 10 C o i,300 and we know the other. Our study of relevant organics suggested θ 10 1330 K (11 kJ mole −1 ) for a one decade change in volatility (Epstein et al., 2010). The simplest approximation is to assume that the entropy of solvation is similar for all relevant organics and that θ vap alone determines volatility in an Arrhenius expression, with a prefactor C A : In this case θ 10 690 K (5.7 kJ mole −1 ) because ln10 = 690/300 . In addition to the simplicity of Eq. (A3), the smaller value of θ 10 has significant empirical support because an analysis of thermodenuder data for ambient organic aerosol measurements using the higher values θ 10 showed that the system simply did not behave realistically, with different volatility bins effectively collapsing to one when the effects of heating were modeled (Cappa and Jimenez, 2010). Whatever the correct value of θ 10 , an important empirical finding is that it is strongly correlated with log 10 C o and it is independent of the degree of substitution (oxidation) of an organic molecule (Epstein et al., 2010). Consequently, we shall make the assumption that the vaporization enthalpy is simply the energy change when a given organic molecule is removed from an organic solvent.

A2 Describing vaporization energies
It is useful to start with a very basic formulation: far more detailed treatments can be found in any advanced thermodynamics text (e.g. Prausnitz et al., 1998). The steps involved in evaporating from a solvent are shown in Fig. A1. Specifically, the overall energy change includes the energy required to remove a molecule i from a solvent s (θ i,s ), but also the change in energy in the solvent when the interaction energy of the molecule with the solvent is replaced by the interaction energy of the solvent with itself (θ s,s ). This gives: Note that when the molecule is the solvent (a single component system), the term in parentheses vanishes and we are left with the simple and intuitive result of θ vap i = θ i,i . This defines C o via Eq. (A2). It is very useful to define the solvent-solute interaction energy θ i,s in terms of the average of the solutesolute (i,i) and the solvent-solvent (s,s) terms: We can substitute this into Eq. (A4) to give: We can also use Eq. (A2) to define C * when i = s, meaning the excess entropy of i is approximately zero (Prausnitz et al., 1998): log 10 C * i,s,300 = θ Given that C * = γ C o , we thus obtain: log 10 γ 300 = − 2δθ i,s θ 10 (A8) If we adopt the Arrhenius expression, Eq. (A3), then at an equivalent level of approximation the excess enthalpy (Prausnitz et al., 1998) is just 2δθ i,s , and the activity coefficient of i in s is given by a Boltzmann expression: This is, quite simply, why activity coefficients are so difficult to predict or even parameterize -an energy difference of only k T , (about 2.5 kJ mole −1 at 300 K), is enough to make the activity coefficient equal to e, and because γ is exponential it will grow (or shrink) very rapidly with increasing |δθ|. Treatments attempting to predict activity coefficients for individual molecules must carefully account for multiple thermodynamic terms (Jang et al., 1997;Nannoolal et al., 2004;Pankow and Chang, 2008); that is appropriate for those specific situations, but the treatments require detailed knowledge of the solvent composition as well as the exact molecule of interest. We have neither.

A3 Simple two-component mixtures
So far the solvent has been of unspecified composition. Now let us consider a molecule A in a solution s that is a twocomponent mixture of molecules A and B, where the fraction of A in the solution is f s A and the fraction of B is f s B = (1 − f s A ). Especially with the weakly characterized solvent we are dealing with for atmospheric aerosols, the volume fraction is most suitable here, as discussed earlier.
We shall therefore replace i with A and s with the mixture to determine the energy of solvation of A in the A-B mixture. If the mixture is homogeneous and isotropic, meaning that the local fractions (and orientations) of the molecules are the same throughout the mixture, then the interaction energy of an individual molecule A with the solvent is From Eq. (A4), we also need to know the interaction energy of the solvent with itself. This is a bit more complicated. We shall assume that the solvent "molecule" replacing A when it evaporates is a composite containing f A of molecule A and (1 − f A ) of molecule B, each of which will interact with the fractions of A and B in the solvent.
We can now calculate δθ A,s , the excess energy of A in s. In a recurring theme, it can be shown that for an ideal mixture, when θ A,B = 1/2 θ A,A + θ B,B = θ, then δθ A,s = 0. Consequently, for non-ideal mixtures with θ A,B = θ +δθ A,B : This then gives the activity coefficient via Eq. (A8) or (A9). The activity coefficient of A in an A-B mixture depends on the extent to which the cross interaction differs from the average of the pure interaction terms. The total excess energy is now a function of the solution composition, increasing quadratically as the solvent becomes increasingly dissimilar to A. We are assuming that the mixture is homogeneous and isotropic because we simply can not say more about atmospheric aerosols. Real molecules (for instance water) do not interact randomly with other molecules. However, the very complexity of ambient OA may make it more homogeneous and isotropic than simpler mixtures. The extreme example is the tendency of mixtures to form liquid or amorphous liquidlike mixtures where the individual constituents would crystalize (Cappa et al., 2008;Zobrist et al., 2008), but much more generally the complex mixtures found in atmospheric OA will smooth out the rough edges of highly specific A-B interactions.

A3.1 Non-ideal activity
It is worth exploring the activity coefficient a bit further. What we are really interested in is the activity of a compound, α i , defined relative to a pure reference state in vapor-liquid phase equilibrium and related to the activity coefficient and the fraction of A in the solution, f A . This is simply: The consequences of this are shown in Fig. A2, using Eq. (A9) for the activity coefficient. The bottom line is that non-ideal behavior emerges very quickly for quite modest values of δθ T , and that this will induce phase separation. However, the real systems we are interested in consist of very complex mixtures. Therefore, we must ask whether it is possible to predict these small deviations from ideality with anything close to sufficient accuracy.