Henry's law constants of diacids and hydroxy polyacids: recommended values

In spite of the importance of diacids and functionalised diacids for organic aerosol formation through aqueous-phase processes in droplets and aerosol water, there seems to be no reliable set of experimental values for their Henry's law constants (HLCs). We show that their estimation through the use of infinite dilution activity coefficients is also prone to error. Here we present HLC values for diacids and hydroxy polyacids determined from solubilities, water activities and vapour pressures of solids or solutions, by employing thermodynamic relationships. The vapour pressures are found to be the largest source of error, but analysis of the obtained HLCs points to inconsistencies among specific vapour pressure data sets. Although there is considerable uncertainty, the HLCs (defined as aqueous concentration per unit gaseous partial pressure) of linear α- and ω-diacids appear to be higher than estimated by the often-cited review work of Saxena and Hildemann (1996).


Introduction
The Henry's law constant (HLC) describes the partitioning of a compound between the gas phase and a highly dilute liquid solution. In the atmosphere, such dilute solutions, with water as solvent, can be reached in cloud droplets. Aqueous aerosols represent another example of an aqueous phase in the atmosphere. Although in the latter case this phase is not highly dilute, owing to the large concentration of other organic and inorganic molecules, Henry's law constants for the aqueous phase are still relevant as a reference point. Polyacids are important constituents of droplets and aerosols. An often-cited HLC compilation for atmospheric purposes is that of Sander (1999); these values are also available in the NIST Chemistry WebBook (Linstrom and Mallard). This compilation contains HLCs for several diacids and functionalised diacids, but most of them are not experimental but rather estimated through group contribution, taken from the review paper of Saxena and Hildemann (1996). We note that Mentel et al. (2004) did measure the HLC of glutaric acid.
In any case, the HLCs of dicarboxylic acids, defined as aqueous concentration per unit gaseous partial pressure, are likely very high, such that these acids will be almost completely dissolved in aqueous droplets. However, for aqueous aerosols, the total water content of the aqueous phase is much lower and the gas-phase fraction could be significant. Furthermore, HLC estimation methods also need reliable HLC data to fit their parameters.
Through thermodynamics, vapour pressures of liquids or solids, solubilities and activity coefficients are all related. The focus in this work is on HLC, but we will need the other quantities as well. Therefore, we first briefly review their thermodynamic relationships. For HLC, several definitions exist. We will follow here the convention taken by Sander (1999):

$$k_h \equiv \lim_{c_s, p_s \to 0} \frac{c_s}{p_s} = c_w \lim_{x_s, p_s \to 0} \frac{x_s}{p_s} \qquad (1)$$

with $c_s$ the molar concentration of the solute, $s$, in the aqueous solution, $p_s$ its partial pressure above it, $x_s$ the mole fraction in the aqueous phase and $c_w$ the molar concentration of pure water (55.6 M). As expressed by Eq. (1), these quantities should in principle be measured at the infinite dilution limit (IDL).
For an ideal solution, where there is no difference in interaction between like and unlike molecules, the partial pressure would equal $p_s = x_s p^0_{L,s}$ (Raoult's law), with $x_s$ the mole fraction of the solute and $p^0_{L,s}$ its liquid saturation vapour pressure. However, in general, solutions are not ideal: the solute molecules behave differently in water than in a liquid of pure $s$, and therefore an activity coefficient correction is needed:

$$p_s = \gamma_s x_s p^0_{L,s} \qquad (2)$$
$\gamma_s$ is the mole-fraction-based activity coefficient. It expresses the preference of the solute for the mixture, compared to the pure component reference state at the same temperature and pressure. Note that in Eq. (2) it is assumed that the gas phase of the solute behaves ideally, a reasonable assumption given that the partial pressure of the solute will be small. If the solvent is the same as the solute, $\gamma_s = 1$, while a value below (above) unity means that the solute prefers the mixture (solvent of pure $s$). Combination of Eqs. (1) and (2) leads to

$$k_h = \frac{c_w}{\gamma_s^\infty \, p^0_{L,s}} \qquad (3)$$

with $\gamma_s^\infty$ the infinite dilution activity coefficient (IDAC) and $p^0_{L,s}$ the liquid saturation vapour pressure.

Suppose now that the solute is a crystalline solid in pure form. Above the solid, gas phase molecules with a solid state vapour pressure $p^0_{Cr,s}$ will be present. Now if the solute is added to water beyond its solubility limit, a solid phase will form in the aqueous phase. If it can be assumed that this solid is the same as in the dry form (e.g. there is no incorporation of water in the crystal structure), one has

$$p^0_{Cr,s} = \gamma_s^{\mathrm{sat}} x_s^{\mathrm{sat}} p^0_{L,s} \qquad (4)$$

with $x_s^{\mathrm{sat}}$ the mole fraction solubility of the solid and $\gamma_s^{\mathrm{sat}}$ the activity coefficient at this point. In principle, a compound's pure liquid and solid state can only coexist at the melting curve, where $p^0_{Cr} = p^0_L$. Hence if the substance is solid at the temperature of interest, the vapour pressure of the subcooled liquid is inaccessible from a thermodynamic point of view, as $x_s$ cannot increase above $x_s^{\mathrm{sat}}$. In practice, however, metastable, strongly supersaturated solutions might exist in small particles (Peng et al., 2001; Soonsin et al., 2010; Huisman et al., 2013), such that $p_s \to p^0_{L,s}$ can be approached. $p^0_{L,s}$ can also be related to $p^0_{Cr,s}$, as sublimation can be seen thermodynamically as first a melting of the solid, and then a vaporization of the resulting liquid. The fusion enthalpy and entropy then have to be taken into account:

$$\ln\frac{p^0_{L,s}(T)}{p^0_{Cr,s}(T)} = \frac{\Delta H_{\mathrm{fus}}(T)}{RT} - \frac{\Delta S_{\mathrm{fus}}(T)}{R} \qquad (5)$$

Unfortunately, $\Delta H_{\mathrm{fus}}$ and $\Delta S_{\mathrm{fus}}$ can only be measured at the fusion point, where $\Delta H_{\mathrm{fus}} = T_{\mathrm{fus}} \Delta S_{\mathrm{fus}}$, which is in our case often far above the temperature of interest (around room temperature). Thus, extrapolation schemes are necessary, e.g. by assuming a constant heat capacity difference, $\Delta C_{p,ls}$, between liquid and solid:

$$\Delta H_{\mathrm{fus}}(T) = \Delta H_{\mathrm{fus}}(T_{\mathrm{fus}}) + \Delta C_{p,ls}\,(T - T_{\mathrm{fus}}), \qquad \Delta S_{\mathrm{fus}}(T) = \Delta S_{\mathrm{fus}}(T_{\mathrm{fus}}) + \Delta C_{p,ls}\,\ln\frac{T}{T_{\mathrm{fus}}} \qquad (6)$$

$\Delta C_{p,ls}$ is generally not available and has to be estimated, e.g. by neglecting it or imposing $\Delta C_{p,ls} \approx \Delta S_{\mathrm{fus}}(T_{\mathrm{fus}})$. This is the approach taken by e.g. Booth et al. (2010).
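As a rough numerical illustration of the solid-to-liquid conversion described above, the following Python sketch evaluates ln(p0_L/p0_Cr) = ΔH_fus(T)/(RT) − ΔS_fus(T)/R, extrapolating the fusion properties from the melting point with a constant ΔC_p,ls. The function name and the fusion data in the usage lines are hypothetical placeholders chosen for illustration; they are not values from this work.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def ln_liquid_over_solid_pressure(T, T_fus, dH_fus, dCp_ls=0.0):
    """Return ln(p0_L / p0_Cr) at temperature T (K).

    dH_fus is the fusion enthalpy at T_fus (J mol^-1); the fusion entropy at
    T_fus follows from dH_fus = T_fus * dS_fus. dCp_ls is the (assumed
    constant) liquid-solid heat capacity difference (J mol^-1 K^-1).
    """
    dS_fus = dH_fus / T_fus
    # Extrapolate enthalpy and entropy of fusion from T_fus down to T.
    dH_T = dH_fus + dCp_ls * (T - T_fus)
    dS_T = dS_fus + dCp_ls * math.log(T / T_fus)
    return dH_T / (R * T) - dS_T / R


# Hypothetical fusion data, for illustration only.
T_FUS, DH_FUS = 460.0, 35.0e3
ln_ratio_zero_dcp = ln_liquid_over_solid_pressure(298.15, T_FUS, DH_FUS)
ln_ratio_dcp_equals_dsfus = ln_liquid_over_solid_pressure(
    298.15, T_FUS, DH_FUS, dCp_ls=DH_FUS / T_FUS
)
print(ln_ratio_zero_dcp, ln_ratio_dcp_equals_dsfus)
```

Comparing the two printed values gives a feeling for how strongly the subcooled liquid vapour pressure depends on the choice made for ΔC_p,ls when the melting point lies far above room temperature.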
If the solubility is low, such that $x_s^{\mathrm{sat}}$ is close to the IDL, one has

$$k_h \approx \frac{c_w x_s^{\mathrm{sat}}}{p^0_{Cr,s}} \approx \frac{c_s^{\mathrm{sat}}}{p^0_{Cr,s}} \qquad (7)$$

In this work, we will first investigate the reliability of HLC values of diacids present in the literature. Then we will evaluate the reliability of UNIFAC (UNIQUAC Functional-group Activity Coefficient)-type group contribution methods to obtain the IDAC, as the IDAC is one ingredient, apart from the subcooled liquid vapour pressure $p^0_L$, to obtain HLC via Eq. (3). Next we will show how more reliable HLC values can be constructed using thermodynamic relationships and existing experimental data on solid state vapour pressures, solubilities and water activities. Vapour pressures of solids or solutions bear the largest uncertainty, with considerable disagreement between the results of different research groups (e.g. Booth et al., 2010; Soonsin et al., 2010; Cappa et al., 2007; Chattopadhyay and Ziemann, 2005). However, comparing the thus-derived HLC between different molecules will reveal clues as to the consistency of the vapour pressure data sets.

Case study on the reliability of compiled literature values for diacids
HLC estimation methods of course need experimental data to fit and/or test their models. Some studies (e.g. Raventos-Duran et al., 2010; Hilal et al., 2008; Modarresi et al., 2007) also report estimated vs. experimental HLC values for diacids. The origin and reliability of the experimental values is then of prime importance, as any estimation method can only be as good as the data on which it is based. We choose here the data set compiled by Raventos-Duran et al. (2010) to fit the model GROMHE, but similar conclusions would very likely be drawn for other compilations. GROMHE was developed especially for atmospherically relevant compounds. It is based on a compilation which includes data for five diacids: oxalic, malonic, succinic, glutaric and adipic acids. Their HLC values as used by Raventos-Duran et al. (2010) are presented in Table 1. The HLC of oxalic acid was taken from Gaffney et al. (1987); however, this article refers to a conference abstract (Gaffney and Senum, 1984) that we could not obtain, and it is therefore unclear how the value was originally obtained.

[Table 1 footnotes (sources of the compiled values): a Gaffney et al. (1987); b Meylan and Howard (2000); c Hilal et al. (2008).]

Furthermore, Gaffney et al. (1987) state that this HLC is an effective one at pH = 4, while the basis set of the GROMHE method should consist of intrinsic HLCs according to Raventos-Duran et al. (2010). For diacids, the relation between effective and intrinsic HLC is

$$k_h^{\mathrm{eff}} = k_h^{\mathrm{intr}} \left( 1 + \frac{K_{a,1}}{[\mathrm{H}^+]} + \frac{K_{a,1} K_{a,2}}{[\mathrm{H}^+]^2} \right) \qquad (8)$$

Given that oxalic acid is a quite strong acid, with dissociation constants of $K_{a,1} = 5.18 \times 10^{-2}$ and $K_{a,2} = 5.30 \times 10^{-5}$ (Apelblat, 2002), it will be mostly in ionised form at this pH, and $k_h^{\mathrm{eff}}/k_h^{\mathrm{intr}} \approx 800$.

The HLC of glutaric acid was taken from another compilation (Hilal et al., 2008). From there, one can follow the trace via the compilation of Modarresi et al. (2007) to the compilation of Sander (1999) and finally to the original source, Saxena and Hildemann (1996), which makes clear that this value is not experimental but rather estimated through a group-contribution method. This notion was explicitly mentioned in the compilation of Sander (1999), but was lost in the compilations of Modarresi et al. (2007), Hilal et al. (2008) and Raventos-Duran et al. (2010), which refer to it directly or indirectly. We emphasise here that these last three compilations were made explicitly to develop and/or test HLC estimation methods; hence, it is required that the compiled HLC values are all experimental rather than estimated!

The HLCs of the other three diacids (malonic, succinic and adipic acids) are all taken from the data compilation available in the EPI suite software (Meylan and Howard, 2000). Checking this compilation, it turns out that in all cases, the HLC does not refer to a directly measured value, but rather to an estimation obtained by combining the solubility $c_s^{\mathrm{sat}}$ of the solid diacid and the solid state vapour pressure $p^0_{Cr}$, using Eq. (7). This is not necessarily a problem; Eq. (7) is correct provided the solubility is low, such that one is close to the IDL. Of course both $c_s^{\mathrm{sat}}$ and $p^0_{Cr}$ should be reliable to obtain a reliable $k_h$. The solubilities $c_s^{\mathrm{sat}}$ are all from the AQUASOL database (Yalkowsky and Dannenfelser, 1992) and they agree reasonably with the values from primary references (Apelblat and Manzurola, 1987, 1989; Marcolli et al., 2004). However, the solubility mole fraction $x^{\mathrm{sat}}$ of malonic acid is about 0.22 (Apelblat and Manzurola, 1987), which is far from the IDL; hence Eq. (7) does not hold. The $x^{\mathrm{sat}}$ of succinic and adipic acids are much lower (0.013 and 0.003 respectively); hence Eq. (7) should be reasonably valid. However, for all these three diacids, the values for $p^0_{Cr}$ are questionable. The values for succinic and adipic acids are from the handbook of Yaws (1994). In this compilation, the vapour pressures of these two acids refer to the liquid phase; the vapour pressures at 25 °C present in the compilation of Meylan and Howard (2000) are obtained by first extrapolating liquid vapour pressures over 157 and 128 K below their melting points respectively, and then converting to solid state vapour pressures. The exact procedure of this conversion is not clear to us, but the reported values $p^0_{Cr}$ seem consistent with a simple approximate procedure like that of Yalkowsky (1979), where only the fusion temperature is from experiments, rather than the more precise procedure where both fusion temperature and enthalpy are taken from experiments. Finally, for malonic acid, the vapour pressure originates from the handbook of Jordan (1954). The cited vapour pressure (0.2 Pa) is however orders of magnitude higher than the ones from recent experiments ($10^{-4}$ to $10^{-3}$ Pa; see e.g. the overview table of Soonsin et al., 2010).
Given that strong reservations can be made for each of the HLC values in Table 1, the need for a reliable HLC set for diacids is clear.
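The size of the correction between an effective and an intrinsic HLC for a diacid can be checked with a few lines of Python. The sketch below assumes the usual two-step dissociation factor 1 + K_a1/[H+] + K_a1 K_a2/[H+]^2; the function name is ours, and the dissociation constants are those quoted above for oxalic acid.

```python
def effective_to_intrinsic_ratio(pH, Ka1, Ka2):
    """Ratio k_eff / k_intr for a diprotic acid at a given pH."""
    h = 10.0 ** (-pH)  # proton concentration in mol L^-1
    return 1.0 + Ka1 / h + Ka1 * Ka2 / h ** 2


# Oxalic acid at pH 4, with the constants cited from Apelblat (2002).
ratio_oxalic_pH4 = effective_to_intrinsic_ratio(4.0, 5.18e-2, 5.30e-5)
print(ratio_oxalic_pH4)  # about 794, consistent with the rounded value of 800
```

The singly and doubly ionised forms contribute about 518 and 275 to this factor, respectively, so neglecting the second dissociation step would underestimate the correction by roughly a third.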

Infinite dilution activity coefficients: usefulness in obtaining HLC
Following Eq. (3), the HLC can be obtained from knowledge of the liquid vapour pressure, $p^0_L$, and the IDAC $\gamma_s^\infty$. Note that diacids and hydroxy polyacids are solid at room temperature, whereas $p^0_L$ and $\gamma_s^\infty$ are required for the subcooled liquid. Let us disregard the problem of obtaining a subcooled vapour pressure, and focus here on obtaining a reliable value of $\gamma_s^\infty$. For the binary acid-water mixtures considered in this work, activity data are mostly restricted to the water component. In principle, this is not a problem, as it follows from the Gibbs-Duhem relation that for a binary mixture, knowledge of the activity coefficient of one component results in knowledge of the activity coefficient of the other component. The Gibbs-Duhem relation in its derivative and integral form is expressed as (see e.g. Mansoori, 1980)

$$x_w \, \mathrm{d}\ln\gamma_w + x_s \, \mathrm{d}\ln\gamma_s = 0 \qquad (9)$$

$$\ln\gamma_s(x_w) = -\frac{x_w}{1 - x_w} \ln\gamma_w(x_w) + \int_0^{x_w} \frac{\ln\gamma_w(t)}{(1 - t)^2} \, \mathrm{d}t \qquad (10)$$

where $x_w$ and $x_s$ denote the mole fraction of water and solute respectively, $\gamma_w$ and $\gamma_s$ are the corresponding activity coefficients, and $t$ is the water mole fraction used as integration variable. If $\gamma_s^\infty$ is desired, Eq. (10) reduces to

$$\ln\gamma_s^\infty = \int_0^1 \frac{\ln\gamma_w(t)}{(1 - t)^2} \, \mathrm{d}t \qquad (11)$$

Hence, to obtain the IDAC of the solute, any functional form of $\gamma_w(t)$ should match the experimental water activity coefficients over the entire concentration range. Activity coefficients can be estimated by fitting an activity coefficient expression (e.g. Margules, Van Laar, Wilson, UNIQUAC; see Prausnitz et al., 1999; Carlson and Colburn, 1942) to activity coefficient data of a particular binary system. $\gamma_s^\infty$ can be derived from the parameters of this mixture-specific model. Another way to obtain the activity coefficient is through the use of group-contribution methods. UNIFAC (Fredenslund et al., 1975; Hansen et al., 1991) is arguably the most popular activity coefficient estimation method based on this group-contribution concept. While a mixture-specific model will generally perform better than a group-contribution model for that specific mixture, its use is limited to that binary mixture.
A group-contribution method like UNIFAC, on the other hand, can be used to predict the activity coefficients of more complex mixtures, including molecules for which no experimental data are available. Peng et al. (2001) and Raatikainen and Laaksonen (2005) provided new UNIFAC parameterisations (called UNIFAC-Peng and UNIFAC-Raatikainen hereafter) using activity and/or solubility data of mixtures with water and diacids or functionalised diacids. A close relative of UNIFAC-Peng is AIOMFAC (Aerosol Inorganic-Organic Mixtures Functional-group Activity Coefficient) (Zuend et al., 2011), as it inherited some of its parameters, while other parameters were inherited from the UNIFAC parameterisation of Marcolli and Peter (2005) (UNIFAC-MP). UNIFAC-MP was adapted to better describe monoalcohols and polyols. It should be mentioned that AIOMFAC has the widest scope of the aforementioned models; it can also describe organic-inorganic and water-inorganic interactions, and is therefore extremely useful for aqueous systems containing both salts and organics.
One would expect that these three models (UNIFAC-Peng, UNIFAC-Raatikainen, AIOMFAC) would give similar IDACs for diacids and hydroxy polyacids, as they were based on experimental data for these molecules. The estimated IDACs are compared in Fig. 1.
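The Gibbs-Duhem route from water activity coefficients to the solute IDAC can be tested numerically. The sketch below assumes the integral relation ln γ∞_s = ∫ ln γ_w(t)/(1 − t)² dt over the water mole fraction t from 0 to 1, and uses a Van Laar expression for ln γ_w, for which the exact answer is simply the solute Van Laar parameter. The parameter values are hypothetical and the function names are ours.

```python
def van_laar_ln_gamma_w(x_w, A_w, A_s):
    """Van Laar ln(gamma_w) for a binary water (w) - solute (s) mixture.

    A_w and A_s are the logarithms of the infinite dilution activity
    coefficients of water and solute; they must have the same sign.
    """
    x_s = 1.0 - x_w
    if x_s == 0.0:
        return 0.0  # pure water reference state
    return A_w / (1.0 + A_w * x_w / (A_s * x_s)) ** 2


def ln_idac_solute(ln_gamma_w, n=20000):
    """Midpoint-rule estimate of the integral of ln_gamma_w(t)/(1-t)^2 on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoints never reach t = 1
        total += ln_gamma_w(t) / (1.0 - t) ** 2
    return total * h


# Hypothetical Van Laar parameters, for illustration only.
A_W, A_S = -1.0, -4.6
ln_idac = ln_idac_solute(lambda t: van_laar_ln_gamma_w(t, A_W, A_S))
print(ln_idac, A_S)  # the two numbers should agree closely
```

Because the integrand is weighted by 1/(1 − t)², the region of high water mole fraction contributes most per unit interval, yet the integral still runs over the whole composition range, including the supersaturated region where data are scarce.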

Hydroxy polyacids
Unexpectedly, large discrepancies, of up to one order of magnitude, show up for the IDAC of malic, tartaric and citric acids as calculated by the various methods. AIOMFAC predicts a higher IDAC than UNIFAC-Peng for all three molecules. UNIFAC-Raatikainen gives IDAC values quite close to UNIFAC-Peng for malic and citric acids, but for tartaric acid it predicts an IDAC an order of magnitude lower. Figure 2 shows experimental $\ln\gamma_w$ data for the three hydroxy polyacids, mixture-specific fittings and predictions through group-contribution methods. The data in the supersaturation range are from Peng et al. (2001), obtained by electrodynamic balance measurements on particles. Measurements on subsaturated bulk solutions are from several data sources (see Appendix A for an overview). It is clear that the particle data are more scattered and coarse than the bulk data. Some observations that can be made from these plots are as follows:
- for malic and tartaric acids, water activity in the supersaturation range is in the order AIOMFAC > UNIFAC-Peng > experimental;
- for tartaric acid, UNIFAC-Peng and UNIFAC-Raatikainen predict a quite different $\ln\gamma_s^\infty$ (Fig. 1). Still, their standard deviation (SD) vs. the experimental data is similar in $\ln\gamma_w$. Close inspection reveals that UNIFAC-Peng matches best the bulk $\ln\gamma_w$ data in the subsaturation range (see Appendix A).
AIOMFAC returns a higher $\ln\gamma_w$ than UNIFAC-Peng over the entire concentration range for all three hydroxy acids (see Fig. 2). From Eq. (11), this explains the systematically higher IDAC predicted by AIOMFAC compared to UNIFAC-Peng. UNIFAC-Peng shows a lower SD vs. the experimental data compared to AIOMFAC. The reason for the discrepancy between UNIFAC-Peng and AIOMFAC can be attributed to the fact that AIOMFAC's hydroxy-water interaction parameters are from UNIFAC-MP, which was developed for monoalcohols and polyols but not for hydroxy acids. Note, however, that in the subsaturation range (see Appendix A) AIOMFAC matches the experimental data better than UNIFAC-Peng for tartaric acid, while UNIFAC-Peng matches better for malic and citric acids.

Fig. 2. Logarithm of the water activity coefficient at 25 °C as a function of the mole fraction of water, for malic, tartaric and citric acids. Particle experimental data are from Peng et al. (2001); bulk experimental data are from several sources (see Appendix A). Fitted activity coefficient expressions (Van Laar, UNIQUAC) are also included, as well as estimations by UNIFAC-Peng, AIOMFAC and UNIFAC-Raatikainen. The standard deviation vs. the experimental data is also given.
Although UNIFAC-Peng has the lowest SD in $\ln\gamma_w$ for the three acids, it overestimates the $\ln\gamma_w$ data in the supersaturation region for malic and tartaric acids. Therefore, we fitted the data with the commonly used Margules, Van Laar, Wilson and UNIQUAC (UNIversal QUAsiChemical) activity coefficient expressions (e.g. Prausnitz et al., 1999; Carlson and Colburn, 1942; see also Appendix A). In Fig. 2 we present the most successful and unique fittings (e.g. for these particular cases, the Margules function gave results very close to the Van Laar function). The resulting parameter set is then used to obtain the solute $\gamma_s^\infty$. All fittings extrapolate to a lower $\gamma_w^\infty$ than the group-contribution methods, and different solute $\gamma_s^\infty$ are obtained: 0.01 for malic acid, $3 \times 10^{-4}$ for tartaric acid, and $5 \times 10^{-3}$ (Van Laar fitting) or $1 \times 10^{-2}$ (UNIQUAC fitting) for citric acid.
However, as the data in the supersaturation region are scattered and coarse, these fittings are not well constrained. This is shown clearly for citric acid, where the Van Laar and the UNIQUAC fittings have a comparable SD, but are quite different in the supersaturation region. We conclude therefore that these IDAC estimations are not an optimal basis to derive HLCs, even if reliable subcooled liquid vapour pressures were available.

Linear diacids
For linear diacids, UNIFAC-Peng and AIOMFAC become identical (Fig. 1). UNIFAC-Raatikainen gives only a slightly higher IDAC for the longer chain diacids. However, this mutual agreement does not guarantee that the methods agree with experiment. Before proceeding further, let us first consider the peculiarities of linear diacid solubilities in more detail.
It is well known that several properties of linear diacids, such as melting point, fusion enthalpy, solubility and solid state vapour pressure, follow an even-odd alternation pattern with the number of carbon atoms in the chain. This is caused by the more stable crystal structure of linear diacids with an even number of carbon atoms (Thalladi et al., 2000). In the case of solubility, this leads to a lower solubility of the diacids with an even number of carbon atoms (Fig. 3). On top of the even-odd alternation pattern, the solubility decreases with the number of carbon atoms. One can view the dissolution of a solute in a solvent as first a melting process and second a mixing process.
The ideal solubility model (Yalkowsky and Wu, 2010) assumes ideal mixing, such that the finite solubility is only caused by the melting process:

$$\ln x_s^{\mathrm{sat,id}} = \ln\frac{p^0_{Cr,s}}{p^0_{L,s}} = -\frac{\Delta H_{\mathrm{fus}}}{R} \left( \frac{1}{T} - \frac{1}{T_{\mathrm{fus}}} \right)$$

where we used approximation (Eq. 7) and assumed a zero $\Delta C_p$. The fusion data were obtained from Booth et al. (2010) and Roux et al. (2005), and (if applicable) the sum over different solid-solid transition points was taken. From Fig. 3, it is clear that apart from the even-odd alternation pattern, there is only a small dependence of the ideal solubility on chain length. Therefore, the lowering of solubility with chain length must be due to a more difficult mixing, or equivalently an increase in activity coefficient. We showed previously (Compernolle et al., 2011) that for the longer chain diacids (starting from C7) UNIFAC-Peng and UNIFAC-Raatikainen underestimate $\gamma_s^{\mathrm{sat}}$, which should be close to $\gamma_s^\infty$ for these low-soluble acids. These longer chain molecules were not in the data set used to develop UNIFAC-Peng or UNIFAC-Raatikainen, and this can explain the lower performance of both methods for these compounds.
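The ideal solubility estimate can be sketched in Python as follows, assuming a zero heat capacity difference so that each transition contributes −ΔH_i/R (1/T − 1/T_i) to ln x. Treating solid-solid transitions in the same way as fusion is our reading of "the sum over different solid-solid transition points"; the function name and all numerical inputs are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def ideal_solubility(T, transitions):
    """Ideal mole-fraction solubility at temperature T (K), zero dCp.

    transitions is a list of (T_i, dH_i) pairs in K and J mol^-1: the fusion
    point plus any solid-solid transition lying between T and the fusion point.
    """
    ln_x = 0.0
    for T_i, dH_i in transitions:
        ln_x -= dH_i / R * (1.0 / T - 1.0 / T_i)
    return math.exp(ln_x)


# Hypothetical data: fusion only, then fusion plus one solid-solid transition.
x_fusion_only = ideal_solubility(298.15, [(420.0, 30.0e3)])
x_with_transition = ideal_solubility(298.15, [(420.0, 30.0e3), (380.0, 2.0e3)])
print(x_fusion_only, x_with_transition)
```

Dividing a measured mole fraction solubility by such an ideal value gives a first estimate of 1/γ_sat, which is how the trend of decreasing solubility at nearly constant ideal solubility translates into an increasing activity coefficient.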
A deeper insight into the driving factors behind the solubility of linear diacids can be achieved by comparing fusion enthalpies and entropies with solution enthalpies and entropies close to the IDL, as done in Fig. 4. The solution enthalpy and entropy for the low-solubility diacids succinic, adipic, suberic and azelaic acids were derived by fitting Van't Hoff equations to the temperature-dependent solubility data of Apelblat and Manzurola (1987), Yu et al. (2012) and Apelblat and Manzurola (1990). Note that for suberic acid, we chose the data of Yu et al. (2012) over those of Apelblat and Manzurola (1990), as the solubility varied more continuously with temperature. Solubility data for pimelic acid are also available (Apelblat and Manzurola, 1989), but the solubility vs. temperature curve is quite irregular; this could mean that several solid-solid transitions take place (e.g. due to uptake of water in the crystal). We therefore omitted the pimelic acid data. Malonic and glutaric acids are highly soluble in water; hence their solution enthalpy and entropy derived from solubility would be far from the IDL. Instead, we took the solution enthalpies derived from calorimetric measurements at low concentrations (Taniewska-Osinska et al., 1990). Given that $\Delta H_{\mathrm{fus}}$ and $\Delta S_{\mathrm{fus}}$ are obtained at a higher temperature than $\Delta H_{\mathrm{sol}}$ and $\Delta S_{\mathrm{sol}}$, it is not fully justified to simply take their differences to obtain $\Delta H_{\mathrm{mix}}$ and $\Delta S_{\mathrm{mix}}$, but for qualitative purposes it will probably suffice. One notices that from glutaric acid on, $\Delta H_{\mathrm{sol}} - \Delta H_{\mathrm{fus}} \approx \Delta H_{\mathrm{mix}}$ gradually increases, meaning that the energetic acid-water interactions become less strong compared to the acid-acid interactions in the pure melt. This is unfavourable for the solution process and is one reason why $x_s^{\mathrm{sat}}$ decreases with chain length. Furthermore, one notices that the entropy of mixing $\Delta S_{\mathrm{sol}} - \Delta S_{\mathrm{fus}} \approx \Delta S_{\mathrm{mix}}$ decreases, especially for the longer chain molecules, again causing $x_s^{\mathrm{sat}}$ to decrease.
For azelaic acid, this entropic effect has become the dominant contribution to the low solubility.
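Deriving a solution enthalpy and entropy from temperature-dependent solubilities amounts to a straight-line fit of ln x_sat against 1/T. The following Python sketch assumes the simple Van't Hoff form ln x_sat = −ΔH_sol/(RT) + ΔS_sol/R with temperature-independent ΔH_sol and ΔS_sol; the function name is ours and the solubility data are synthetic, generated from known parameters so that the fit can be checked.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def fit_vant_hoff(temps, x_sat):
    """Least-squares fit of ln x = -dH_sol/(R T) + dS_sol/R.

    Returns (dH_sol, dS_sol) in J mol^-1 and J mol^-1 K^-1.
    """
    u = [1.0 / T for T in temps]
    y = [math.log(x) for x in x_sat]
    n = len(u)
    u_mean = sum(u) / n
    y_mean = sum(y) / n
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    sxy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    slope = sxy / sxx
    intercept = y_mean - slope * u_mean
    return -R * slope, R * intercept


# Synthetic solubilities from hypothetical dH_sol = 30 kJ/mol, dS_sol = 60 J/(mol K).
temps = [278.15, 288.15, 298.15, 308.15, 318.15, 328.15]
x_sat = [math.exp(-30.0e3 / (R * T) + 60.0 / R) for T in temps]
dH_sol_fit, dS_sol_fit = fit_vant_hoff(temps, x_sat)
print(dH_sol_fit, dS_sol_fit)
```

With real data, an irregular ln x_sat vs. 1/T curve (as noted above for pimelic acid) would show up as large residuals from this straight line.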
Such a strongly negative $\Delta S_{\mathrm{mix}}$ is typical for the dissolution of hydrophobic molecules in water. The presence of the hydrophobic chain causes the water molecules to reorder themselves, resulting in an entropy decrease. There are many examples of this effect in the HLC compilation of Abraham et al. (1990), for example for the series of linear 1-alkanols. For linear α- and ω-diacids, and presumably also for linear α- and ω-diols, with both tails hydrogen bonding, it takes a longer chain before this hydrophobic effect becomes important.

[Table 2, last row and footnotes]
SD($\log_{10}(\gamma_s^\infty/\gamma_s^{\mathrm{sat}})$) (g): 0.10, 0.14, 0.63
a The effect of acid dissociation is significant for oxalic acid. Clegg and Seinfeld (2006a) calculated the activity of the undissociated acid and concluded that, within the data uncertainty, Raoult's law could be assumed. For the other acids the dissociation is a minor effect.
b Apelblat and Manzurola (1987).
c Apelblat and Manzurola (1989).
d Bretti et al. (2006).
e The activity calculator for dicarboxylic acid solutions available at the E-AIM website (Clegg and Seinfeld, 2006a, b) was used, taking explicit dissociation into account and considering the activity coefficient of the undissociated acid.
f Peng: UNIFAC-Peng. Raatikainen: UNIFAC-Raatikainen.
g Standard deviation in $\log_{10}(\gamma_s^\infty/\gamma_s^{\mathrm{sat}})$ of the group-contribution method vs. the values derived in this work.

HLC data: results
From Eq. (7) it follows that HLC can be derived from solubility and solid state vapour pressure data, provided the solubility is low enough such that the IDL is a good approximation. But even if the compounds are quite water soluble, as is the case for the short-chained linear diacids and the hydroxy polyacids, one can still derive the HLC. Indeed, the combination of Eqs. (3) and (4) leads to

$$k_h = \frac{\gamma_s^{\mathrm{sat}}}{\gamma_s^\infty} \, \frac{c_w x_s^{\mathrm{sat}}}{p^0_{Cr,s}}$$

The ratio $\gamma_s^{\mathrm{sat}}/\gamma_s^\infty$ can be retrieved if sufficient water activity data are available in the concentration range between the solubility limit and infinite dilution:

$$\ln\frac{\gamma_s^\infty}{\gamma_s^{\mathrm{sat}}} = \frac{x_w^{\mathrm{sat}}}{1 - x_w^{\mathrm{sat}}} \ln\gamma_w(x_w^{\mathrm{sat}}) + \int_{x_w^{\mathrm{sat}}}^{1} \frac{\ln\gamma_w(t)}{(1 - t)^2} \, \mathrm{d}t$$

with $x_w^{\mathrm{sat}} = 1 - x_s^{\mathrm{sat}}$. The important point is that the integral no longer involves the supersaturation region, where data are typically less precise and/or scarce. Solubility data $x_s^{\mathrm{sat}}$ (see Table 2) were taken from Apelblat and Manzurola (1987, 1989) and Bretti et al. (2006), and were consistent with other solubility data (Marcolli et al., 2004). Functional expressions for $\ln\gamma_w(t)$ were fitted using only subsaturation water activity data; the details are given in Appendix A and a summary of the results in Table 2. $\gamma_s^\infty/\gamma_s^{\mathrm{sat}}$ estimations using UNIFAC-Peng, AIOMFAC and UNIFAC-Raatikainen are also presented, as well as estimations from the activity calculator for aqueous dicarboxylic acid solutions available at the E-AIM website (http://www.aim.env.uea.ac.uk/aim/accent2/) and described by Clegg and Seinfeld (2006a, b). Similarly to this work (see Appendix A), this activity calculator consists of models individually fitted to specific diacid-water systems. The resulting $\gamma_s^\infty/\gamma_s^{\mathrm{sat}}$ estimations are therefore very close to our work. Among the group-contribution methods, UNIFAC-Peng gives results closest to our work. UNIFAC-Raatikainen deviates the most, and gives an exceptionally low value in the case of tartaric acid.
For the longer linear chain diacids (C6 and higher), we did not find water activity data in the subsaturation range in the literature. In most cases, their solubility is low enough that $\gamma_s^\infty/\gamma_s^{\mathrm{sat}} \approx 1$ can be assumed. This was confirmed by $\gamma_s^\infty/\gamma_s^{\mathrm{sat}}$ estimations using UNIFAC or AIOMFAC. A somewhat higher value is predicted only for pimelic acid.
To derive the HLC values, the $\gamma_s^\infty/\gamma_s^{\mathrm{sat}}$ derived in this work are used for the linear diacids C3-C5 and the hydroxy polyacids. For oxalic acid $\gamma_s^\infty/\gamma_s^{\mathrm{sat}} = 1$ is assumed following Clegg and Seinfeld (2006a). For pimelic acid, the value estimated by UNIFAC-Peng is used. For the other linear diacids (C6, C8-C10) $\gamma_s^\infty/\gamma_s^{\mathrm{sat}} = 1$ is adopted given their low solubility.

[Table 3 footnotes, as far as recoverable]
… $p^0_{Cr}$ data measured at relatively high $T$: from 318-358 K for succinic acid to 353-385 K for sebacic acid.
c $p^0_{Cr}$ data measured at relatively high $T$: from 339-357 K for malonic acid to 367-377 K for azelaic acid.
d The $p^0_{Cr}$ value of $1.0 \times 10^{-8}$ Pa in Table 1 of Cappa et al. (2007) is likely a typo. Comparing with $\Delta H_{\mathrm{sub}}$ and $\Delta S_{\mathrm{sub}}$ in the same table reveals that the value $1.0 \times 10^{-7}$ Pa should be taken.
e This value is not derived from $p^0_{Cr}$ data but is directly measured.
f This value is not derived from $p^0_{Cr}$ data, but from $p^0_L$ data, using Eq. (3) and $\gamma_s^\infty$ estimated in Section 3.1. The spread originates from the uncertainty in the $p^0_L$ data of Huisman et al. (2013). The true uncertainty will be higher due to uncertainty in $\gamma_s^\infty$.
g Estimated using Eqs. (5), (7) and (14), with fusion properties taken from Booth et al. (2010) and assuming $\Delta C_{p,ls} \approx 0$.
h As g, but assuming $\Delta C_{p,ls} \approx \Delta S_{\mathrm{fus}}(T_{\mathrm{fus}})$.
In Table 3, the HLC data at 25 °C derived from Eq. (15) or Eq. (7) are presented. As the solid state vapour pressures in particular disagree between different sources, we grouped the data in Table 3 per solid state vapour pressure reference. Where possible, the enthalpy of gas-phase dissolution is also given, calculated as $\Delta H_{g \to aq} = \Delta H_{\mathrm{sol}} - \Delta H_{\mathrm{sub}}$. Huisman et al. (2013) provide liquid phase vapour pressures rather than solid state vapour pressures for tartaric and citric acids. In a first approach, we applied Eq. (3), using the IDACs from the fittings discussed in Sect. 3. In a second approach, $k_h$ was estimated by combining Eqs. (5), (7) and (14), using fusion properties taken from Booth et al. (2010) and assuming $\Delta C_{p,ls} \approx 0$ or $\Delta C_{p,ls} \approx \Delta S_{\mathrm{fus}}(T_{\mathrm{fus}})$. Both approaches return reasonably consistent results, although for tartaric acid, the result of the second approach depends strongly on the assumption for $\Delta C_{p,ls}$.
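The chain of steps from solubility, water activity and solid state vapour pressure to an HLC can be sketched in Python as below. The sketch assumes k_h = (γ_sat/γ∞) c_w x_sat / p0_Cr, with ln(γ∞/γ_sat) obtained from a boundary term plus an integral of ln γ_w(t)/(1 − t)² over the subsaturated range only. A Van Laar expression with hypothetical parameters stands in for a fitted ln γ_w, the function names are ours, and the vapour pressures in the usage lines are illustrative rather than recommended values.

```python
import math

C_W = 55.6             # molar concentration of pure water, mol L^-1
PA_PER_ATM = 101325.0  # pressure conversion


def van_laar_ln_gamma_w(x_w, A_w, A_s):
    """Van Laar ln(gamma_w); A_w, A_s are ln(IDAC) of water and solute."""
    x_s = 1.0 - x_w
    if x_s == 0.0:
        return 0.0
    return A_w / (1.0 + A_w * x_w / (A_s * x_s)) ** 2


def ln_idac_over_sat(ln_gamma_w, x_w_sat, n=20000):
    """ln(gamma_inf / gamma_sat) using water data on [x_w_sat, 1] only."""
    boundary = x_w_sat / (1.0 - x_w_sat) * ln_gamma_w(x_w_sat)
    h = (1.0 - x_w_sat) / n
    integral = 0.0
    for i in range(n):
        t = x_w_sat + (i + 0.5) * h  # midpoint rule
        integral += ln_gamma_w(t) / (1.0 - t) ** 2
    return boundary + integral * h


def hlc_from_solid(x_sat, p_cr_pa, ln_ratio_inf_over_sat=0.0):
    """k_h in M atm^-1 from solubility and solid state vapour pressure (Pa)."""
    p_cr_atm = p_cr_pa / PA_PER_ATM
    return math.exp(-ln_ratio_inf_over_sat) * C_W * x_sat / p_cr_atm


# Hypothetical Van Laar parameters and vapour pressures, for illustration only.
A_W, A_S = -0.5, -1.5
x_s_sat = 0.22  # a highly soluble acid, far from the infinite dilution limit
ln_ratio = ln_idac_over_sat(lambda t: van_laar_ln_gamma_w(t, A_W, A_S), 1.0 - x_s_sat)
k_h_soluble = hlc_from_solid(x_s_sat, 1.0e-3, ln_ratio)
k_h_low_solubility = hlc_from_solid(0.013, 1.0e-5)  # ratio taken as 1
print(ln_ratio, k_h_soluble, k_h_low_solubility)
```

Setting the logarithmic ratio to zero recovers the low-solubility approximation, so the same function covers both the short-chain acids and the sparingly soluble longer chains.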

Consistency of solid state vapour pressure data
If other homologous series (linear alkanes, acids, 1-alkanols, 2-ketones, etc.) are any guide (Sander, 1999), one would expect a rather slow variation of the HLC of linear diacids with chain length compared to e.g. the liquid vapour pressure $p^0_L$. For example, when going from acetic to hexanoic acid, the HLC at 25 °C is lowered by a factor of 4, while $p^0_L$ is lowered roughly by a factor of 400. Also, no even-odd alternation of $k_h$ or gas dissolution enthalpy $\Delta H_{g \to aq}$ with chain length is expected, as this is a peculiarity of properties involving the crystalline phase. Figure 5 presents the HLC and $\Delta H_{g \to aq}$ of the linear diacids vs. carbon number, grouped per reference of solid state vapour pressure. The large variation in $k_h$ reflects the variation in $p^0_{Cr}$ from different data sources. Some of the lowest $k_h$ and, in absolute value, $\Delta H_{g \to aq}$ are found for Salo et al. (2010), especially for the longer chains C8-C10, where $k_h$ decreases rapidly with chain length. This is likely due to samples that are not purely crystalline, a possibility acknowledged by these authors. For pimelic acid, Salo et al. (2010) could distinguish two modes and they attributed the one with the lowest $p^0$ to the crystalline phase. This is probably correct, as for this acid the derived HLC and $\Delta H_{g \to aq}$ are more comparable to those derived from $p^0_{Cr}$ data of other authors. To a smaller extent, the HLC derived from the Chattopadhyay and Ziemann (2005) data also decreases rapidly from C7 on. The dissolution enthalpies derived from the data of Booth et al. (2010) and of Bilde et al. (2003) exhibit a strong even-odd alternation (although in reverse directions), contrary to expectation. This could be an indication of experimental artefacts in the measurement of $\Delta H_{\mathrm{sub}}$ in these works. The HLC data derived from Bilde et al. (2003) also exhibit an even-odd alternation. The HLC data derived from Ribeiro da Silva et al. (1999, 2001) and Cappa et al. (2007, 2008) exhibit the smallest variation with chain length, more in line with the expectation. We recommend the HLC derived from the data of Cappa et al. (2007, 2008), as their measurements were closer to room temperature than those of Ribeiro da Silva et al. (1999, 2001). The HLC derived from the $p^0_{Cr}$ measurements on saturated solutions of Soonsin et al. (2010) are also recommended; these authors make a convincing case that these are preferable over $p^0_{Cr}$ measurements on the solid state.
Apart from diacids, Booth et al. (2010) also presented $p^0_{Cr}$ data on hydroxy polyacids (malic, tartaric and citric acids). Using fusion enthalpy data, these data were then converted to subcooled liquid $p^0_L$. From these data, it followed that $p^0_L$(tartaric) > $p^0_L$(succinic) and $p^0_L$(citric) > $p^0_L$(adipic), which is counterintuitive, as one generally expects a lower $p^0_L$ with increasing number of polar groups. However, one could argue that for molecules with many functional groups, it is difficult for the molecules to achieve optimal intermolecular bonding for all functional groups at once. Comparing $k_h$ instead of $p^0_L$ can provide a more stringent test; the small water molecules should more easily interact with all functional groups at once. From Table 3, one finds that $k_h$(tartaric) > $k_h$(succinic) and $k_h$(citric) > $k_h$(adipic), which seems to be at least qualitatively correct. We recommend, however, the HLC derived from the data of Huisman et al. (2013). In that work, the same technique is used as Soonsin et al. (2010) used for linear diacids, and the expected order $p^0_L$(tartaric) < $p^0_L$(succinic) and $p^0_L$(citric) < $p^0_L$(adipic) is preserved. The derived HLCs are about 6-7 orders of magnitude higher than those derived from the Booth et al. (2010) data.

Atmospheric implications
Notwithstanding the large variations in the derived HLCs of the linear diacids, they are most often higher than the estimations provided by the review work of Saxena and Hildemann (1996). For clouds, the liquid water content (LWC) varies between 0.1 and 1 g m$^{-3}$, and for aqueous aerosols between $10^{-6}$ and $10^{-4}$ g m$^{-3}$ (Ervens et al., 2011). If partitioning between gas and aqueous phase is governed solely by Henry's law, the aqueous phase fraction, $f_{aq}$, of a species can be calculated from
$$f_{aq} = \frac{k_h}{k_h + k^*}, \qquad k^* = \frac{\rho_w}{\mathrm{LWC}\, R\, T},$$
with $\rho_w$ as the water density, $R$ the gas constant and $T$ the temperature. For clouds, $k^*$ is between $4 \times 10^4$ and $4 \times 10^5$ M atm$^{-1}$. For oxalic acid, the lowest $k_h$ value from Table 3 is $6.0 \times 10^6$ M atm$^{-1}$, leading to $f_{aq}$ between 0.94 and 0.993. If the acid dissociation of oxalic acid at a typical pH of 4 is also taken into account (Eq. 8), $f_{aq}$ is above 0.9999. The other $k_h$ values for oxalic acid from Table 3 are about two orders of magnitude higher, leading to an even more complete dissolution. For the other species in Table 3, $k_h$ varies between $10^8$ and $10^{11}$ M atm$^{-1}$ (provided one dismisses the lowest values from Salo et al. (2010), which probably correspond to samples that were not purely crystalline), i.e. orders of magnitude higher than $k^*$. Hence, for clouds, the diacids and hydroxy polyacids should reside almost completely in the aqueous phase. For aqueous aerosols, $k^*$ is typically between $4 \times 10^8$ and $4 \times 10^{10}$ M atm$^{-1}$, which is in the range of the $k_h$ values from Table 3 for linear diacids. To the extent that the HLCs reported here are applicable, one can conclude that for linear diacids significant partitioning to either the aqueous phase or the gas phase is possible, depending on the species and the LWC. However, an aqueous aerosol is not a dilute aqueous solution but, on the contrary, a concentrated solution containing both organics and inorganics. Therefore, in a more rigorous treatment, an activity coefficient model (e.g. AIOMFAC, Zuend et al., 2011) should be used, provided the mixture composition is known.
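The partitioning estimate above can be reproduced with a short sketch. It assumes that the aqueous fraction takes the form f_aq = k_h/(k_h + k*) with k* = ρ_w/(LWC·R·T), a form that reproduces the k* and f_aq values quoted in the text when T = 298.15 K and ρ_w = 1 g cm⁻³ (both assumptions, as is every function name). The dissociation correction uses the standard expression for a diprotic acid, which may differ in detail from Eq. (8), and the pKa values for oxalic acid are approximate literature values assumed for illustration.

```python
R_L_ATM = 0.082057  # gas constant, L atm mol^-1 K^-1
RHO_W = 1.0e6       # water density, g m^-3 (assumed 1 g cm^-3)


def k_star(lwc, temperature=298.15):
    """HLC (M atm^-1) at which half of a species resides in the aqueous phase.

    lwc         : liquid water content, g of water per m^3 of air
    temperature : temperature in K (assumed value, not stated in the text)
    """
    return RHO_W / (lwc * R_L_ATM * temperature)


def aqueous_fraction(k_h, lwc, temperature=298.15):
    """Aqueous phase fraction if partitioning follows Henry's law only."""
    return k_h / (k_h + k_star(lwc, temperature))


def effective_k_h(k_h, ph, pka1, pka2):
    """Effective HLC of a diprotic acid, including both dissociation steps."""
    h = 10.0 ** (-ph)
    ka1 = 10.0 ** (-pka1)
    ka2 = 10.0 ** (-pka2)
    return k_h * (1.0 + ka1 / h + ka1 * ka2 / h ** 2)


if __name__ == "__main__":
    k_h_oxalic = 6.0e6  # lowest oxalic acid value quoted in the text, M atm^-1

    # Cloud conditions: LWC between 0.1 and 1 g m^-3.
    for lwc in (0.1, 1.0):
        print(f"LWC = {lwc} g m^-3: k* = {k_star(lwc):.2e} M atm^-1, "
              f"f_aq = {aqueous_fraction(k_h_oxalic, lwc):.4f}")

    # Assumed approximate pKa values for oxalic acid (illustrative only).
    k_eff = effective_k_h(k_h_oxalic, ph=4.0, pka1=1.25, pka2=4.27)
    print(f"with dissociation at pH 4: f_aq = {aqueous_fraction(k_eff, 0.1):.6f}")

    # Aqueous aerosol conditions: LWC between 1e-6 and 1e-4 g m^-3.
    for lwc in (1.0e-6, 1.0e-4):
        print(f"LWC = {lwc} g m^-3: k* = {k_star(lwc):.2e} M atm^-1")
```

Expressing the result through the single threshold k* makes the comparison with tabulated HLCs direct: a species with k_h well above k* is essentially fully dissolved, while one with k_h close to k* is split between both phases.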
(1971), a fit with the Wilson function finally gives $\gamma^\infty_s/\gamma^{\mathrm{sat}}_s = 1.2$.
Glutaric acid. The data of Wise (2003) are very scattered and are therefore not used. The other data clearly show that $a_w > x_w$. Fitting with the Van Laar formula gives an activity coefficient ratio of $\gamma^\infty_s/\gamma^{\mathrm{sat}}_s = 3.1$.
Adipic acid. The solubility of adipic acid is very low, so that $\gamma^\infty_s/\gamma^{\mathrm{sat}}_s \approx 1$ can be anticipated. This is confirmed by the data point of Marcolli et al. (2004), where $a_w \approx x_w$ at the solubility limit.

A4 Hydroxy polyacids
For malic, tartaric and citric acids, one has $a_w \leq x_w$ (Fig. A2). Note that only malic acid was considered by Clegg and Seinfeld (2006a).
Malic acid. We selected all the data of Davies and Thomas (1956), Carlo (1971), Apelblat et al. (1995a) and Robinson et al. (1942), while from the data of Wise (2003), Peng et al. (2001) and Velezmoro and Meirelles (1998) we selected only the $a_w \leq 0.95$ points. The data of Maffia and Meirelles (2001) were excluded, as their $a_w$ values were lower than those of the other data sources. Fitting with the Margules function resulted in $\gamma^\infty_s/\gamma^{\mathrm{sat}}_s = 0.52$.
Tartaric acid. We selected all the data of Apelblat et al. (1995a) and Robinson et al. (1942). We selected only the $a_w \leq 0.95$ points for the Maffia and Meirelles (2001) data and the $a_w \leq 0.97$ points for the Velezmoro and Meirelles (1998) data. The data of Velezmoro and Meirelles (1998) were excluded, as their $a_w$ values were lower than those of the other data sources.
Citric acid. The data from the different sources (Levien, 1955; Peng et al., 2001; Maffia and Meirelles, 2001; Velezmoro and Meirelles, 1998) are in good agreement with each other. We only excluded the data points of Velezmoro and Meirelles (1998)