Introduction
Organic aerosols are complex mixtures of thousands of different types of
compounds that vary in structure and physicochemical properties. This
diversity poses challenges for comprehensive characterization, even while
estimates of overall mass abundance and its contributing factors are still
desirable. Functional group (FG) analysis is an approach that presents a
level of characterization that provides a bridge between full molecular
speciation, which is useful for precisely tracking specific classes of
physical and chemical transformations,
and elemental composition, which is useful for mass closure analysis.
FGs are structural units in molecules that describe important condensed-phase
interactions that contribute to properties like volatility and
hygroscopicity, and
FG analysis has provided information useful for overall organic mass quantification
and its apportionment by source class in past studies.
FGs are also central to understanding reactivity and resulting chemical
transformations, and their characterization by measurement and in model
simulation can provide a method of evaluating our understanding of
functionalization (i.e., through bonding with heteroatoms) in organic aerosol
mixtures. However, studies on this topic have thus far been very limited on
account of challenges in quantitative characterization of FGs, which requires
either advanced algorithms (e.g., for spectral interpretation) or derivatization steps
(e.g., for chemical analysis). In
anticipation of continued progress in analytical technology,
and introduced a method for
harvesting FG information from molecularly speciated measurements (e.g., gas
chromatography–mass spectrometry, GC-MS; ) and chemically
explicit model simulation (e.g., Master Chemical Mechanism, MCMv3.2;
).
In this study, we build upon the work by to further
improve our capability for model–measurement intercomparison using FG
analysis. compared changes in relative molar abundances
of FGs in chamber experiments measured by Fourier transform infrared (FT-IR)
spectroscopy against composition simulated with a chemically explicit
gas-phase reaction mechanism coupled to a gas–particle (G/P) partitioning
module. As molar FG composition is directly obtained from measured FT-IR
absorbances, this is a sensible metric used to track changes in chemical
composition and has been used in other studies.
However, estimating FG contributions to
carbon-centric metrics more commonly used to characterize organic aerosol
oxidation or mass yields, such as organic carbon (OC) and organic matter (OM)
mass, OM / OC mass ratios, atomic ratios, and mean carbon oxidation state,
is not straightforward.
Central to this task is understanding which fraction of carbon atoms is
“detected” by measurement of any given set of FGs, and estimating the overall
carbon abundance from FGs without multiply counting the polyfunctional carbon
atoms.
Some of these metrics have been calculated from FT-IR measurements by
previous researchers based on assumptions regarding the underlying molecular
structure e.g.,. For instance, assumed bonding
configurations in secondary organic aerosol (SOA) products to be consistent
with the parent volatile organic compound (VOC) to estimate the carbon content
from measured FG abundance. also use the number of carbon
atoms in the parent VOC to normalize FG concentrations reported for SOA
mixtures. introduced a functional group index (FGI) to
conceptualize how OM / OC ratios vary according to chain length and
functionalization for specific sets of compound classes, and provided an
evaluation from mass spectrometry measurements that comprised up to 10 %
of the total OM mass. Using results from numerical simulation of SOA
formation, we now describe methods for estimating carbon content based on
molecular parameters that describe the underlying mixture composition
consisting of a diverse set of polyfunctional compounds, and a means of
examining dependence of carbon-centric metrics on composition without
invoking knowledge about molecular chain lengths, which is not well
characterized by FG analysis. The benefit of developing a systematic approach
is that we can precisely understand the achievable mass recovery, as well as biases
incurred on the calculated O / C and OM / OC for a given set of
molecules and FGs analyzed (when chemical extraction is not required, OM mass
recovery is primarily dependent on the completeness of FG calibration models
constructed). These estimates may then be used to propose mixture-specific
adjustments to facilitate more direct intercomparisons with other data. This
work will focus on FG abundances obtained by FT-IR measurements, but many
aspects are generalizable to other types of FG analysis.
The objective described above is addressed in this work by conceptualizing
(1) SOA as a collection of carbon atoms that are functionalized in different
ways and (2) FT-IR as a tool that measures some subset of such functionalized
carbon structures. These “carbon types” give rise to the FGs observed in
measurement and can be used to calculate the OM properties described above.
Carbon type representation of complex mixtures
has a strong precedent in the study of organic chemistry in the atmosphere.
For example, the carbon bond mechanism defines chemical
reaction schemes according to reactivity of carbon atoms classified according
to functionality, without regard to membership in a molecule. The “carbon
vector” in GECKO-A is a description of functionalized
carbon types and retains information regarding transformations in
functionalization (while a separate connectivity matrix tracks transformation
in the carbon skeleton upon accretion or fragmentation). In the commonly used
volatility basis set (VBS), changes in carbon mass are conserved according to
functionalization by oxygen, nitrogen, or overall carbon oxidation state.
Quantitative analysis
of additional “groups” that describe the underlying skeletal (e.g., ring,
aromatic, or unsaturated) structures that change with fragmentation and
accretion reactions has not been sufficiently advanced by
FG analysis to provide complete estimates of mean molecular size and other
aerosol properties that govern volatility and solubility.
However, past precedents mentioned above indicate that classification of
carbon atoms according to extent of functionalization may have merit in
harmonizing observations with model representations for calculating common
mixture characteristics of OM.
In this work, we illustrate how measured FGs can be related to properties of
various carbon types comprising a diverse set of polyfunctional molecules. We
use the proposed relationships to determine which carbon types are measured
according to FGs included in calibration models and biases resulting from
partial analysis of the different carbon types in the mixture. For
illustration, α-pinene gas-phase photooxidation simulation in the
presence of NOx with G/P partitioning is analyzed and compared
against chamber experiments upon which the simulations were based. We
assume a perfect calibration (i.e., flawless knowledge of the bond
abundance) to isolate biases due to measured and unmeasured carbon types. Such
a scenario is obviously not physically achievable, but it serves as a convenient
reference by which we can proceed with a meaningful model–measurement
comparison.
Methods
After describing our data set in Sect. , we introduce a
few relationships among FG, atomic composition, and carbon types in
Sect. . We then describe how we can estimate whether a
particular carbon type is detected by FT-IR based on the set of FG
calibrations used and properties that we calculate as a result in
Sect. . We then present methods for actually
estimating the number of polyfunctional carbon atoms from FG abundance to
minimize multiple counting in Sect. . The code and
software used in this and previous papers are made available under the
GNU Public License (Appendix ).
Data set
We focus this analysis on a specific simulation scenario of
in which comparison of model results to reference
measurements had the smallest discrepancy according to relative molar
abundance of FGs, until model–measurement agreement diverged on what was
attributed to the role of heterogeneous chemistry and aging not implemented
in the model. To briefly describe the simulation, the MCMv3.2 gas-phase
chemistry module generated by the Kinetic Pre-Processor was coupled with a gas/particle organic absorptive
partitioning scheme via operator splitting . The SIMPOL.1
group contribution model was used to estimate the
equilibrium vapor pressure for individual molecules, and the dynamics of mass
transfer to a monodisperse particle population were simulated using LSODE (Livermore Solver for Ordinary Differential Equations;
). Wall losses of particles and semivolatile
organic compounds (SVOCs) were neglected. The scenario we further
analyze for this study was defined by initial α-pinene and NOx
concentrations of 300 and 240 ppb, respectively. The relative humidity was
fixed at 61 %, which influenced the rate of HO2 radical
self-reaction to form hydrogen peroxide, but water uptake and its influence on G/P
partitioning were not considered. The light intensity was fixed
to be consistent with experimental conditions. This
scenario was labeled the “APIN-lNOx” simulation. In this work, we will
refer to this as the APIN simulation, as we discuss none of the other
scenarios and thus eliminate the need for an additional modifier to the
label. To focus on a particular mixture, we select a reference period as the
apex in SOA concentration occurring at 9.3 h (labeled as
tmaxSOA) of the 22 h simulation as used by
to examine molecular contributions to overall SOA mass
and FG abundance. With detailed knowledge of molecular structure and
composition in this simulation, we apply the analysis described in
Sect. –.
The conditions for the simulations described above were selected to mimic
chamber experiments in which FG composition was measured by .
collected particles between 86 and 343 nm onto
(infrared-transparent) zinc selenide crystals by impaction, and samples were
analyzed immediately afterward to minimize storage artifacts. Samples were
scanned rapidly to minimize evaporative losses in the FT-IR sample
compartment. report that repeated analysis of the same
samples by FT-IR yielded consistent results, suggesting robustness in
reported values. Samples collected during 3.1–4.2 and 17.6–21.6 h (which
we label as “4 h” and “21 h”, respectively) were selected by
for comparison against model simulation for the
corresponding periods, and we will follow this convention here.
Only relative metrics are used as reported measurements in
mole fractions of FGs, and the simulations do not include wall losses of
particles and SVOCs that affect overall estimates of yield. Neglecting
compound-specific SVOC deposition to walls may further incur biases in
relative compositions as raised by , but for this
conceptual study we neglect its effect as its parameters are not precisely
known.
Definitions
The molar abundance of molecules $\mathbf{n}_\mathrm{molec} = [n_i]$ in a
mixture (consisting of a set of molecules denoted by $\mathcal{M}$) can be
related to FG abundance $\mathbf{n}_\mathrm{group} = [n_j]$ (for each FG in
$\mathcal{J}$) obtained by FT-IR – or other means – by invoking a group
composition matrix $\mathbf{X} = [x_{ij}]$, which describes the FG makeup of
each molecule. Using scalar notation, we write
$$n_j = \sum_{i \in \mathcal{M}} n_i x_{ij} \quad \forall\, j \in \mathcal{J}.$$
$n_j$ is the observed quantity from measurement and represents the sum of FG
composition of molecules weighted by their molar abundance.
A statement of atom balance is enabled by the group-atom matrix
$\boldsymbol{\Lambda} = [\lambda_{aj}]$ by relating
$n_j$ to the atomic abundance $\mathbf{n}_\mathrm{atom} = [n_a]$ in the
mixture:
$$n_a = \sum_{j \in \mathcal{J}} \lambda_{aj} n_j.$$
However, the fact that the same polyfunctional carbon atom can be associated
with several FGs poses challenges for reasoning out $\lambda_{\mathrm{C},j}$
for carbon. Therefore, we introduce a carbon type matrix
$\mathbf{Y} = [y_{ik}]$ that enumerates the composition of each molecule in
terms of specific number of carbon types, and a carbon-group matrix
$\boldsymbol{\Theta} = [\theta_{kj}]$ that
relates each carbon type to its unique structure of functionalization.
A statement of FG balance can be constructed from the carbon type matrix,
carbon-group matrix, and group composition matrix:
$$\sum_{k \in \mathcal{C}} y_{ik} \theta_{kj} = x_{ij} \quad \forall\, i \in \mathcal{M},\ j \in \mathcal{J}.$$
Conversely, a statement of carbon type balance can be made by introducing a
matrix, $\boldsymbol{\Phi} = [\phi_{jk}]$, which maps FG abundance to carbon
type abundance:
$$y_{ik} = \sum_{j \in \mathcal{J}} x_{ij} \phi_{jk} \quad \forall\, i \in \mathcal{M},\ k \in \mathcal{C}.$$
A minimal illustration for two simple molecules, ethane and ethanol, is
shown in Fig. . Symbols are tabulated in Table
. Explanation of additional arrays Λ
(atom-group matrix), ζ (carbon oxidation state vector), and
z (oxidation state contribution vector) completing the atom and
oxidation state balance follow below. In contrast to concise expressions used
in Fig. , we continue with use of scalar notation below to
more conveniently invoke element-wise, row-wise, and column-wise summations,
but we will return to array notation for describing solutions to system of
equations (Sect. ).
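The FG balance and group abundance relations above can be made concrete with a minimal sketch for ethane and ethanol. The figure itself is not reproduced in the text, so the matrix entries below are an assumed encoding (FGs ordered as aCH and aCOH, carbon types ordered as CH3 and CH2OH), and the molar abundances are arbitrary illustration values.

```python
# Assumed encoding of the ethane/ethanol illustration (not taken from the figure).
# FG order: [aCH, aCOH]; carbon type order: [CH3, CH2OH].
THETA = [[3, 0],   # CH3 carbon: three C-H bonds
         [2, 1]]   # CH2OH carbon: two C-H bonds and one alcohol group
Y = [[2, 0],       # ethane: two CH3 carbons
     [1, 1]]       # ethanol: one CH3 carbon and one CH2OH carbon


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


# FG balance: X = Y Theta gives the FG makeup of each molecule.
X = matmul(Y, THETA)

# Group abundance n_j = sum_i n_i x_ij for arbitrary molar abundances n_i.
n_molec = [2.0, 1.0]  # hypothetical amounts of ethane and ethanol
n_group = [sum(n * row[j] for n, row in zip(n_molec, X)) for j in range(2)]

print(X)        # FG counts per molecule
print(n_group)  # mixture-level FG abundance
```

In this encoding the FG balance recovers six aCH bonds for ethane and five aCH bonds plus one aCOH group for ethanol, which is what a bond count on the structures gives.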
Average number of atoms attached to each type of bond assumed for various types of
mixtures. λC,COOH=λC,carbonyl=1.
Study | Mixture type | λC,CH | λC,COH | λC,CONO2
 | ambient | 0.5 | 1 |
 | ambient | 0.5 | 1 |
 | indoor/ambient | 0.48 | |
 | α-pinene SOA | 0.63 | 0.63 | 0.63
 | guaiacol SOA | 0.88 | 0.88 | 0.88
Several* | ambient | 0.5 | 0.5 | 0.25
 | ambient | 0.5 | 0 |
* Reflects assumptions by ,
, and .
Illustration of carbon type and FG relationships for ethane and
ethanol. The FG composition matrix (X), carbon type matrix
(Y), and atom composition matrix (A) describe
properties of the compounds, and the remaining arrays – oxidation
state contribution vector (z), carbon–FG matrix (Θ), FG–carbon matrix
(Φ), atom–FG matrix (Λ), and carbon oxidation state vector
(ζ) – establish their inter-relationships.
In our APIN mechanism, there are 327 molecules, 22 FGs, and 41 carbon types
(Fig. ), though several are associated with radical
structures or unusual structures that are not found in the most abundant
compounds. These do not contribute to the organic aerosol mass, but they are
included for a complete description of the APIN mechanism. Furthermore, while
the equalities introduced in Fig. are formulated to hold at
the level of individual molecules, we demonstrate their application in
describing the underlying relationships in molecular mixtures.
Visualization of the carbon-group matrix Θ for
the APIN mechanism. Radical groups are denoted with (*). Carbon types and FGs
are ordered by their aerosol abundance (in decreasing order) in the APIN
simulation at tmaxSOA (Sect. ) with
each value of OSC and z, respectively. The numeric label for
carbon types indicates the overall rank (without regard for its
OSC) in the APIN simulation at tmaxSOA.
Formaldehyde and formic acid are subclasses of aldehyde and COOH,
respectively,
but are defined separately to fulfill the conditions described in
Supplement Sect. S1.
Further details regarding the FG definitions are provided by
.
FGs belonging to measured subset J*= Set1 (Sect. )
are colored in red; additional FGs belonging to Set2 and Full are colored in blue and green,
respectively. Corresponding carbon atoms C* that are associated with
(i.e., detectable by) J* are shown in the same colors.
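The carbon type matrix provides a conceptual relationship for relating FGs to
the number of carbon atoms in a mixture (Eq. for carbon is also
restated on the right-hand side),
$$n_\mathrm{C} = \sum_{i \in \mathcal{M}} \sum_{k \in \mathcal{C}} n_i y_{ik} = \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{J}} n_i \lambda_{\mathrm{C},j} x_{ij},$$
and we can see from Eqs. () and () that
$\lambda_{\mathrm{C},j}$ is equivalent to the row-wise summation of
$\phi_{jk}$ (i.e., over all carbon types $k$ for a given FG $j$):
$$\lambda_{\mathrm{C},j} = \sum_{k \in \mathcal{C}} \phi_{jk} \quad \forall\, j \in \mathcal{J}.$$
Previous values for $\boldsymbol{\lambda}_\mathrm{C}$ are shown in
Table . The atomic abundance for each carbon type $k$ is
calculated as $n_{ka} = \sum_{j \in \mathcal{J}} \lambda_{aj} \theta_{kj}$, as
follows from Eqs. () and ().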
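As a hedged check of the two forms of the carbon balance, the sketch below counts carbon atoms once through the carbon type matrix and once through FG abundances weighted by λC. It assumes the same ethane/ethanol encoding as an illustration (FG order aCH, aCOH), with λC,aCH = λC,aCOH = 1/3 for that two-compound system and arbitrary molar abundances.

```python
from fractions import Fraction

# Assumed ethane/ethanol encoding; FG order is [aCH, aCOH].
X = [[6, 0], [5, 1]]   # FGs per molecule
Y = [[2, 0], [1, 1]]   # carbon types per molecule, ordered [CH3, CH2OH]
lambda_c = [Fraction(1, 3), Fraction(1, 3)]  # carbon atoms per FG (assumed)
n_molec = [2, 1]       # hypothetical molar abundances

# Left-hand form: count carbon atoms through the carbon type matrix.
n_c_types = sum(n * sum(row) for n, row in zip(n_molec, Y))

# Right-hand form: count carbon atoms through FG abundances and lambda_C.
n_c_groups = sum(n * sum(l * x for l, x in zip(lambda_c, row))
                 for n, row in zip(n_molec, X))

print(n_c_types, n_c_groups)
```

Both forms agree for this small system because each carbon atom carries exactly three measured FGs; in mixtures with more varied functionalization a single coefficient per FG is generally only an approximation.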
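The mean carbon oxidation state can be estimated from (1) $y_{ik}$ through
the oxidation state $\boldsymbol{\zeta} = [\zeta_k]$ specific to carbon type,
and (2) $x_{ij}$ and individual FG contributions $\mathbf{z} = [z_j]$ to carbon
oxidation state:
$$\overline{\mathrm{OS}}_\mathrm{C} = \frac{1}{n_\mathrm{C}} \sum_{i \in \mathcal{M}} \sum_{k \in \mathcal{C}} n_i y_{ik} \zeta_k = \frac{1}{n_\mathrm{C}} \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{J}} n_i x_{ij} z_j.$$
From Eq. (), we can see that $\zeta_k$ and $z_j$ are
related through the following equality:
$$\zeta_k = \sum_{j \in \mathcal{J}} \theta_{kj} z_j \quad \forall\, k \in \mathcal{C}.$$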
All elements in Eq. can be known precisely for any
set of molecules M from the chemometric patterns and atom-level validation described by , which are summarized in Sect. S1.
Furthermore, the FGs included in the APIN system are all those which are
defined by association only to single carbon atoms (e.g., alcohol,
carboxylic, methylene groups). Methods for extending this analysis to FGs
containing multiple carbon atoms (e.g., anhydride, ester, and organic
peroxide groups) are described in Sect. S2. Solution methods for ϕjk
and λC,j are presented in Sect. .
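The relation between carbon type oxidation states and FG contributions can be sketched for the two carbon types of ethanol. The FG contributions used here (−1 per C–H bond, +1 per C–OH bond) are assumptions that follow the usual oxidation state bookkeeping, and the dictionary layout is only illustrative.

```python
# Assumed carbon-group entries (FGs per carbon type) and FG contributions z_j.
THETA = {
    "CH3": {"aCH": 3},
    "CH2OH": {"aCH": 2, "aCOH": 1},
}
z = {"aCH": -1, "aCOH": +1}

# zeta_k = sum_j theta_kj * z_j for each carbon type k.
zeta = {k: sum(count * z[j] for j, count in fgs.items())
        for k, fgs in THETA.items()}

# Mean carbon oxidation state of ethanol (one carbon of each type).
ethanol = {"CH3": 1, "CH2OH": 1}
mean_os_c = sum(n * zeta[k] for k, n in ethanol.items()) / sum(ethanol.values())

print(zeta, mean_os_c)
```

The methyl carbon evaluates to −3, consistent with the value quoted for CH3 in the Results, and the mean for ethanol is −2.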
Theoretical mass recovery and estimated properties
This section describes methods for determining whether a carbon type is detected
by FT-IR and how relationships introduced in Sect. can
be modified for a more direct comparison with measurements. The main idea is
to consider only the subset of carbon atoms bonded to any of the FGs
measured in a given experiment and to evaluate properties for those carbon
atoms alone, which establishes the achievable degree of characterization of the SOA.
Given a set of measured FGs $\mathcal{J}^* \subseteq \mathcal{J}$
and the corresponding subset of carbon atoms $\mathcal{C}^* \subseteq \mathcal{C}$ that contain at least one of these FGs, we can estimate the number of
carbon atoms measured from a modification of Eq. ():
$$n_\mathrm{C}^* = \sum_{i \in \mathcal{M}} \sum_{k \in \mathcal{C}^*} n_i y_{ik} = \sum_{i \in \mathcal{M}} \sum_{k \in \mathcal{C}} n_i y_{ik} \cdot \operatorname{sgn}\Bigl(\sum_{j \in \mathcal{J}^*} \theta_{kj}\Bigr).$$
sgn is the signum function, which will return 0 when its
argument is 0 (no FGs associated with carbon type k are in the measured
set) and 1 when its argument is positive (one or more FGs belong to the
measured set). The total carbon recovery is calculated as nC*/nC.
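A minimal sketch of the recovery calculation follows. The carbon types, their FG counts, and the mixture-level abundances are hypothetical values chosen for illustration only; they are not results from the APIN simulation, and the two FG sets are simplified stand-ins for Set1 and Set2.

```python
def sgn(value):
    """Signum: 0 for a zero argument, 1 for a positive one, -1 for a negative one."""
    return (value > 0) - (value < 0)


# Hypothetical carbon-group entries: FGs attached to each carbon type.
THETA = {
    "CH3": {"aCH": 3},
    "CH2": {"aCH": 2},
    "ketone C": {"ketone": 1},
    "CH-OOH": {"aCH": 1, "hydroperoxide": 1},
    "C-OOH": {"hydroperoxide": 1},
    "quaternary C": {},  # bonded only to other carbon atoms
}
# Hypothetical mixture-level carbon abundance per type (sum_i n_i y_ik).
n_carbon = {"CH3": 30, "CH2": 25, "ketone C": 15,
            "CH-OOH": 10, "C-OOH": 10, "quaternary C": 10}


def carbon_recovery(measured):
    """Fraction n_C*/n_C of carbon atoms bonded to at least one measured FG."""
    detected = sum(
        n * sgn(sum(c for j, c in THETA[k].items() if j in measured))
        for k, n in n_carbon.items())
    return detected / sum(n_carbon.values())


set1_like = {"aCH", "ketone"}
set2_like = set1_like | {"hydroperoxide"}
print(carbon_recovery(set1_like), carbon_recovery(set2_like))
```

Adding hydroperoxide to the measured set recovers the carbon type that carries no other measured FG, while the quaternary carbon stays undetected in both cases.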
We consider three sets of FGs for J*. Set1 = {aCH, aCOH,
COOH, ketone and aldehyde carbonyl, CONO2} comprises FGs
reported by and many others e.g.,. Set2 = Set1 + {eCH, hydroperoxide,
peroxyacyl nitrate} comprises Set1 and three additional FGs that are
not commonly reported for OM characterization but have medium to strong
absorption bands in the mid-infrared wavelengths
(Appendix ) (not inclusive) and are relevant for this system.
The set labeled as “Full” comprises all groups present in OM, including
quaternary and tertiary sp2 carbon (carbon atoms that are only bonded to
other carbon atoms) that account for 7 % of the mass in the APIN
simulation at tmaxSOA, and also the remaining groups
(Fig. ) that account for < 1 % of the
remaining mass.
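We can estimate OM as the sum of elements multiplied by their respective
atomic weights using Eq. (). Atomic ratios are defined
as $n_a/n_\mathrm{C}$ for all heteroatoms $a \in \{\mathrm{H}, \mathrm{N}, \mathrm{O}\}$ (S is not included in this chemical mechanism, but this
principle can be extended for mechanisms that include it). For the measured subset, the atomic abundance is restricted to the measured FGs:
$$n_a^* = \sum_{j \in \mathcal{J}^*} \lambda_{aj} n_j,$$
and atomic ratios are calculated as $n_a^*/n_\mathrm{C}^*$.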
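To estimate the mean carbon oxidation state, we can replace $n_\mathrm{C}$
with $n_\mathrm{C}^*$ and sum over $\mathcal{J}^*$ instead of $\mathcal{J}$
in Eq. () by corollary with Eq. ():
$$\overline{\mathrm{OS}}_\mathrm{C} \approx \frac{1}{n_\mathrm{C}^*} \sum_{j \in \mathcal{J}^*} z_j n_j.$$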
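The property estimates restricted to measured FGs can be sketched as follows for a single compound, hydroxyacetone, chosen only because its composition is easy to verify by hand. The H and O content per FG, the oxidation state contributions, and the function name `estimate_properties` are assumptions introduced for illustration.

```python
# Assumed H and O content of each FG (a small slice of a group-atom matrix)
# and assumed FG contributions to carbon oxidation state.
LAMBDA = {
    "aCH": {"H": 1},
    "aCOH": {"H": 1, "O": 1},
    "ketone": {"O": 1},
}
Z = {"aCH": -1, "aCOH": 1, "ketone": 2}
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def estimate_properties(n_group, n_c_star, measured):
    """Return (atomic ratios, mean OS_C, OM/OC) using measured FGs only."""
    atoms = {"H": 0.0, "O": 0.0}
    for j in measured:
        for a, count in LAMBDA[j].items():
            atoms[a] += count * n_group[j]
    ratios = {a: n / n_c_star for a, n in atoms.items()}
    os_c = sum(Z[j] * n_group[j] for j in measured) / n_c_star
    oc_mass = n_c_star * ATOMIC_WEIGHT["C"]
    om_mass = oc_mass + sum(n * ATOMIC_WEIGHT[a] for a, n in atoms.items())
    return ratios, os_c, om_mass / oc_mass


# One mole of hydroxyacetone, CH3-C(=O)-CH2OH. All three carbons carry a
# measured FG in both cases below, so n_C* = 3 in each.
n_group = {"aCH": 5, "aCOH": 1, "ketone": 1}
full = estimate_properties(n_group, 3, {"aCH", "aCOH", "ketone"})
no_alcohol = estimate_properties(n_group, 3, {"aCH", "ketone"})
print(full)
print(no_alcohol)
```

Dropping aCOH from the measured set leaves the detected carbon unchanged here but removes one oxygen and one hydrogen, so O / C, OM / OC, and the mean oxidation state are all biased low in this particular example.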
Estimation of carbon abundance
In this section, we describe methods for estimating $n_\mathrm{C}$ from
measured abundance of FGs. The main objective is to arrive at a set of
coefficients $\hat{\boldsymbol{\lambda}}_\mathrm{C}$ that, when multiplied by FG abundance
$n_j$ for measured FGs $\mathcal{J}^*$, provide an estimate
$\hat{n}_\mathrm{C}^*$ that does not count the same carbon atom more than once
when it is attached to several of the FGs analyzed:
$$\hat{n}_\mathrm{C}^* = \sum_{j \in \mathcal{J}^*} \hat{\lambda}_{\mathrm{C},j} n_j.$$
The use of the hat over a symbol denotes a statistically estimated quantity.
It is convenient to continue discussion of solutions to a system of equations
in array notation (similar to what is used in Fig. ). Let
$\mathbf{Y} = [n_i y_{ik}]$, $\mathbf{X} = [n_i x_{ij}]$,
$\boldsymbol{\Theta} = [\theta_{kj}]$, $\boldsymbol{\Phi} = [\phi_{jk}]$,
$\boldsymbol{\lambda}_\mathrm{C} = [\sum_{k \in \mathcal{C}} \phi_{jk}]$, and
$\mathbf{n}_\mathrm{C} = [\sum_{k \in \mathcal{C}} y_{ik}]$. The FG and carbon
type abundances can be written as $\mathbf{Y}\boldsymbol{\Theta} = \mathbf{X}$. The most obvious solution is to take the generalized or
Moore–Penrose inverse, $\hat{\boldsymbol{\Phi}} = \boldsymbol{\Theta}^{+}$. In
the example illustrated in Fig. , the solution
$\boldsymbol{\Phi} = \boldsymbol{\Theta}^{-1}$ and
$\boldsymbol{\lambda}_\mathrm{C}$ (a row of $\boldsymbol{\Lambda}^\mathrm{T}$) obtained using
such an approach is provided. The elements of $\boldsymbol{\Phi}$ satisfy the
carbon type balance (Eq. ) and are not required to be
non-negative, but their summation across rows (Eq. ) yields
values for $\boldsymbol{\lambda}_\mathrm{C}$ that correspond to the number
of carbon atoms per FG associated with them. While exact solutions can be
found for this illustration because $\boldsymbol{\Theta}$ is square (i.e.,
the number of carbon types equals the number of types of FGs), the
pseudo-inverse solution will not be meaningful in a more general case as the
number of ways in which FGs are arranged on carbon atoms exceeds the number
of measured FGs used for discrimination. $\boldsymbol{\lambda}_\mathrm{C}$ may
also not correspond to a physically interpretable quantity in such instances,
as a single set of coefficients is insufficient to estimate the exact
abundances of carbon atoms under these circumstances.
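For the square case, the inverse and the resulting coefficients can be worked through exactly. The sketch below assumes a 2 × 2 carbon-group matrix for ethane and ethanol (carbon types CH3 and CH2OH, FGs aCH and aCOH); the entries are an assumed encoding of the illustration rather than values copied from the figure, and `inverse_2x2` is a helper written for this example.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Exact inverse of a 2x2 integer matrix using rational arithmetic."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed carbon-group matrix: rows are carbon types [CH3, CH2OH],
# columns are FGs [aCH, aCOH].
THETA = [[3, 0],
         [2, 1]]

# Phi has FGs as rows and carbon types as columns.
PHI = inverse_2x2(THETA)

# lambda_C,j is the sum over carbon types k for each FG j (row-wise sum).
lambda_c = [sum(row) for row in PHI]

print(PHI)
print(lambda_c)
```

One element of Φ is negative (−2/3 for the aCOH row and the CH3 column), which illustrates why individual elements need not be non-negative even though the row sums, 1/3 for each FG, are physically interpretable.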
Therefore, while carbon types are a useful concept to describe the underlying
representation of functionalized organic compounds, it is generally not
possible to retrieve the exact abundance of each carbon type from FG
measurements. To arrive at an approximate solution for estimation of the
total carbon atoms without discrimination of carbon types, we consider the
three approaches described below.
First, we consider each carbon type in isolation (“COUNT” method) and
average the reciprocal of the number of measured FGs per carbon enumerated for
each carbon type:
$$\hat{\lambda}_{\mathrm{C},j} = \frac{1}{|\mathcal{C}_j|} \sum_{k \in \mathcal{C}_j} \frac{1}{\sum_{j' \in \mathcal{J}^*} \theta_{kj'}}.$$
$|\cdot|$ denotes the cardinality of (i.e., number of elements in) a set, and
$\mathcal{C}_j$ is the set of carbon types in which FG $j$ appears; this set is
the origin of the dependence of $\lambda_\mathrm{C}$ on $j$. The main premise
of this approach is to apportion fractional units of carbon to each measured
FG such that their sum equals unity. The rationale can be supported by the
illustration (Fig. ) in which the value of 1/3 for
$\lambda_\mathrm{C}$ reflects the reciprocal of the number of measured FGs
attached to each carbon atom.
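The COUNT estimate can be sketched directly from its definition. The carbon-group entries below are the assumed ethane/ethanol encoding used for illustration, and the function name `count_lambda` is hypothetical.

```python
from fractions import Fraction


def count_lambda(theta, measured):
    """COUNT estimate of carbon atoms per FG.

    theta maps each carbon type to a dict of FG counts; measured is the set
    of FGs assumed to be quantified.
    """
    lam = {}
    for j in measured:
        # Carbon types in which FG j appears (the set C_j).
        types_with_j = [fgs for fgs in theta.values() if fgs.get(j, 0) > 0]
        if not types_with_j:
            continue
        # Reciprocal of the number of measured FGs on each such carbon type.
        shares = [Fraction(1, sum(c for fg, c in fgs.items() if fg in measured))
                  for fgs in types_with_j]
        lam[j] = sum(shares) / len(shares)
    return lam


# Assumed carbon types for the ethane/ethanol illustration.
THETA = {"CH3": {"aCH": 3}, "CH2OH": {"aCH": 2, "aCOH": 1}}

both = count_lambda(THETA, {"aCH", "aCOH"})
ach_only = count_lambda(THETA, {"aCH"})
print(both, ach_only)
```

When both FGs are measured, every carbon carries three measured FGs and each coefficient is 1/3. When only aCH is measured, the CH2OH carbon carries two measured FGs, so the averaged coefficient shifts to 5/12, which shows how the estimate depends on the measured set.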
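In the second approach (“COMPOUND” method), we find $\hat{\boldsymbol{\Phi}}$
that corresponds to the least squares solution to the following equation:
$$\hat{\mathbf{Y}} = \mathbf{X} \hat{\boldsymbol{\Phi}}.$$
$\hat{\boldsymbol{\lambda}}_\mathrm{C}$ is found by row-wise summation of
$\hat{\boldsymbol{\Phi}}$ (Eq. ) (which is also equivalent to
solving for $\hat{\boldsymbol{\lambda}}_\mathrm{C}$ directly in the reduced
expression, $\hat{\mathbf{n}}_\mathrm{C} = \mathbf{X} \hat{\boldsymbol{\lambda}}_\mathrm{C}$). Given the wide range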
of possibilities in composition, we set molar abundances to unity such that
each compound within each group (SVOC) is uniformly weighted. We average over
carbon types present in molecules relevant to certain mixture classes with
uniform weighting such that the derived coefficients are not overly specific
to any particular mixture.
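The reduced least squares problem for the COMPOUND method can be sketched with a handful of small compounds, each given unit weight. The four compounds and the two-FG basis are hypothetical choices for illustration and are unrelated to the SVOCs in the APIN mechanism; the normal equations are solved in closed form for two unknowns.

```python
from fractions import Fraction

# Hypothetical compounds with unit molar abundance; FG order is [aCH, aCOH].
# Rows: ethane, ethanol, methanol, ethylene glycol.
X = [[6, 0], [5, 1], [3, 1], [4, 2]]
n_c = [2, 2, 1, 2]  # carbon atoms per compound

# Normal equations (X^T X) lambda = X^T n_C for two unknowns.
a11 = sum(r[0] * r[0] for r in X)
a12 = sum(r[0] * r[1] for r in X)
a22 = sum(r[1] * r[1] for r in X)
b1 = sum(r[0] * n for r, n in zip(X, n_c))
b2 = sum(r[1] * n for r, n in zip(X, n_c))

det = a11 * a22 - a12 * a12
lambda_hat = [Fraction(b1 * a22 - a12 * b2, det),
              Fraction(a11 * b2 - a12 * b1, det)]

# Estimated carbon atoms per compound from the fitted coefficients.
n_c_hat = [lambda_hat[0] * r[0] + lambda_hat[1] * r[1] for r in X]
print([float(v) for v in lambda_hat])
print([float(v) for v in n_c_hat])
```

No single pair of coefficients reproduces all four carbon counts exactly (methanol carries four FGs on one carbon, whereas the other carbons carry three), so the fit returns a compromise with nonzero residuals.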
In the third approach (“MIXTURE” method), we reformulate Y=[nmiyij] and X=[nmixij] such that its rows
contain the FG abundance of the mixture of each time step tm of the APIN
simulation, and λC is found by fitting
X to nC*, the time series of carbon atom
concentration in the condensed phase at each time step. For MIXTURE, we use a
constrained least squares approach where the values of the regression
coefficients are bounded between 0 and 1 as the coefficients for FGs with low
abundance (e.g., eCH and CONO2) are not well constrained (the
solution is insensitive to their values).
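A bounded least squares fit of the kind used for MIXTURE can be sketched with projected gradient descent. The text does not name the constrained solver, so this algorithm is a stand-in chosen for simplicity, and the three-time-step series below is hypothetical rather than taken from the APIN simulation.

```python
def bounded_least_squares(X, y, lower=0.0, upper=1.0, iterations=2000):
    """Minimize 0.5 * ||X lam - y||^2 subject to lower <= lam_j <= upper.

    Uses projected gradient descent with a fixed step equal to the inverse
    of trace(X^T X), which bounds the largest eigenvalue of X^T X.
    """
    n_coef = len(X[0])
    step = 1.0 / sum(v * v for row in X for v in row)
    lam = [0.5] * n_coef
    for _ in range(iterations):
        resid = [sum(l * v for l, v in zip(lam, row)) - t
                 for row, t in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X))
                for j in range(n_coef)]
        # Gradient step followed by projection onto the box constraints.
        lam = [min(upper, max(lower, l - step * g))
               for l, g in zip(lam, grad)]
    return lam


# Hypothetical FG abundances [aCH, aCOH] at three time steps and the
# corresponding condensed-phase carbon abundance.
X = [[10.0, 1.0], [8.0, 5.0], [6.0, 9.0]]
n_c_star = [4.5, 3.6, 2.6]

lambda_hat = bounded_least_squares(X, n_c_star)
print(lambda_hat)
```

For these numbers the unconstrained solution would assign a slightly negative coefficient to aCOH, so the bounded fit places it at the lower limit of 0 and adjusts the aCH coefficient accordingly.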
Numerical details aside, the main differences among the three are the data
sets used for estimation. COUNT uses information from Θ
only (defined for the FGs in the APIN mechanism), COMPOUND uses carbon type
abundances in compounds (limited to SVOCs in the APIN mechanism), and MIXTURE
uses mixture information of the condensed phase (from different periods in
the APIN simulation). The resulting differences in estimates of
λ^C are largely due to weighting of FGs associated
with each carbon type: each type receiving equal weight (COUNT), by frequency
of occurrence in SVOCs (COMPOUND), and by abundance in SOA formed in the APIN
simulation (MIXTURE). While the COUNT method is physically significant at the
level of individual carbon atoms, the representativeness of estimated values
for use in mixtures can vary according to composition. Direct fitting
methods, on the other hand, may lead to insignificant coefficients from
under-represented or redundant FGs, or be overly specific such that they
cannot be generalized to other systems. Therefore, the results from all three
methods are evaluated to explore the range of plausible values.
Each of the solutions produces a series of irrational numbers (due to the
multiplicitous configurations of FGs on carbon atoms) that may be overly
precise for the data set used for estimation. As later shown, we will also
adjust the COUNT solutions to rational values of {1/4, 1/3, 1/2, 1}
(with exception for λC,aCH which we fix to a value
of 0.45 as explained in Sect. ), and we will refer
to this as the “NOMINAL” solution. For the COMPOUND and MIXTURE methods,
FGs and carbon types with a unique (one-to-one) correspondence (e.g., carbon
atoms associated with carboxylic acid and ketonic and aldehydic carbonyl
groups) are excluded from the fitting, as their coefficients are known
unambiguously. Evaluations of estimates are expressed as a ratio of the
estimate over the reference value: n^C*/nC*. We
remark that we focus on harvesting information from the APIN simulation
results only, but these methods can (and should) be applied to study
abundances in molecular speciation data from chamber experiments under
different oxidation and environmental conditions in future work.
Time series of carbon type abundances for the APIN simulation
described in Sect. . The carbon types are defined in
Fig. .
Results
We first describe the APIN simulation results of recast
in terms of abundance of carbon types in Sect. .
We then describe mass recovery and biases in property estimates due solely to
unmeasured carbon atoms in Sect. . In
Sect. , we describe results from applying different
methods for estimating carbon abundance from measured FGs. Finally, in
Sect. , we present estimates of properties from
FG measurements and compare to model simulations.
Evolution of carbon types
The time series of carbon type abundance is shown by its contribution
fraction for each time period in Fig. , and the carbon
type composition of the most abundant molecules at tmaxSOA
is depicted in Fig. . Descriptions for the carbon
types found in tmaxSOA are shown in
Fig. . We observe that carbon type
composition changes rapidly within the first 4 h but much
more slowly after this period. Many of the dominant carbon types are
similar between the gas and aerosol phases and include methyl
(CH3), methylene (CH2), ketone, primary and
secondary alcohol, acid (COOH), hydroperoxide, and peroxyacyl nitrate
groups. However, the order of abundance is different between phases – for
instance, the peroxyacyl nitrate is more abundant in the gas phase (carbon
type 10; Fig. ). As visualized in
Fig. and described by , the
molecular abundance is dominated by a small number of polyfunctional
compounds (out of the [200] compounds in the mechanism), so their carbon
types are weighted heavily in the overall carbon type composition.
Compound and carbon type abundance for APIN simulation at
tmaxSOA. C97OOH and C98OOH are large, polyfunctional
compounds containing ketone and hydroperoxide groups. The carbon types are
defined in Fig. .
Cumulative carbon fraction for APIN simulation at
tmaxSOA. Colors show carbon atoms measurable by different sets
of FGs (Sect. ). The carbon types are defined
in Fig. .
Values for λC with standard errors in parentheses
where available
(uncertainties were not calculated for the constrained optimization algorithm in the MIXTURE estimation method).
Values for λC,COOH=λC,carbonyl=1 are fixed and therefore not
included in the table.
Set | Method | aCH | aCOH | CONO2 | eCH | hydroperoxide
Set1 | COUNT | 0.39 (0.04) | 0.52 (0.17) | 0.52 (0.17) | |
Set1 | COMPOUND | 0.47 (0.01) | 0.31 (0.06) | 0.64 (0.11) | |
Set1 | MIXTURE | 0.45 | 0.09 | 1.00 | |
Set1 | NOMINAL | 0.45 | 0.50 | 0.50 | |
Set2 | COUNT | 0.39 (0.04) | 0.52 (0.17) | 0.52 (0.17) | 0.75 (0.25) | 0.52 (0.17)
Set2 | COMPOUND | 0.48 (0.01) | 0.26 (0.05) | 0.54 (0.09) | 1.08 (0.20) | 0.35 (0.07)
Set2 | MIXTURE | 0.50 | 0.16 | 0.41 | 1.00 | 0.00
Set2 | NOMINAL | 0.45 | 0.50 | 0.50 | 1.00 | 0.50
SOA properties for APIN simulation at tmaxSOA.
Atomic ratios (na*/nC*) shown
in panels (a)–(c) are in molar units, and OM / OC
ratios shown in panel (d) are in mass units. The abundance of carbon
used for normalization is defined by the detectable carbon for each set of
FGs (Sect. ), which can cause estimated
ratios with Set1 or Set2 to exceed those of the Full case.
Theoretical mass recovery and property estimation
The ordered contribution to mass recoveries of OC and OM for the most
dominant carbon types at tmaxSOA are displayed in
Fig. . Greater than 99.9 % of the OC and OM mass is
accounted for by 15 carbon types during this period, while more than 20
compounds are required to reconstruct aerosol OC mass with > 99.9 %
recovery (Fig. ). Mass recovery with Set1 is on
the order of 80 %. The fraction of OC estimated by FT-IR relative to OC
measured by thermal optical methods is often within a similar range.
With additional bonds in Set2,
93 % carbon recovery is achieved. The unmeasured carbon types are
quaternary and tertiary sp2 carbon that are bonded only to other carbon atoms, and
together comprise 7 % of the OC (Full case).
Going from Set1 to Set2, the increase in fraction of recovered OM is greater
than that of recovered OC because the hydroperoxide and peroxyacyl nitrate mass is
much greater than the mass of carbon bearing these FGs. The resulting effect
on estimated properties is shown in Fig. . H / C recovery
is already high for Set1, but the oxygen from hydroperoxide
and peroxyacyl nitrate is missing. The contribution of eCH is small, and N / C is very small (low-NOx conditions). OM / OC can be off by 0.2. Even with nearly full
mass recovery, ratios are often inflated by a small amount on account of the
unmeasured carbon (i.e., nC*≤nC).
Distribution of carbon oxidation states and their ensemble estimate for the
APIN simulation at tmaxSOA. Panel (a) shows
distribution and measurable carbon atoms with same color scheme
. Panel (b) shows various estimates of
OSC for the mixture using different FG sets
(Sect. ). 2O / C–H / C is a common
approximation used by elemental analysis and is included for
reference.
The carbon oxidation state distribution and recoverable portions for
tmaxSOA are shown in Fig. a. This figure
visually reinforces the abundance of methyl carbons (CH3,
OSC=-3) and methylene carbons (CH2, OSC=-2)
discussed above, though there are other carbon types contributing to the
OSC=-2 category (Fig. ). The carbon types that are unmeasurable
with FT-IR, the
quaternary and tertiary sp2 carbon, have OSC = 0 (carbon types which are measurable in
the OSC= 0 category have a balance of negative and positive
values from aCH and electronegative heteroatoms). The value of the additional
FGs in Set2 lies in the characterization of oxidizing FGs (hydroperoxide and
peroxyacyl nitrate) that reside on carbon atoms with overall oxidation states of 1
and 3. Estimates of the mean OS‾C are shown in
Fig. , panel b. We can see that the bias in estimation for
neglecting hydroperoxide and peroxyacyl nitrate is not as great as for the
O / C ratio, since the OSC is determined by the atom and bond
connected to the carbon atom directly, and the rest of the multiple oxygen
atoms in the FG are not considered. The 2O / C–H / C estimate
commonly used with elemental analysis will lead to a slight overestimation of
the OS‾C in the event that oxygen
single-bonded to carbon (hydroxyl and hydroperoxide groups) exists in large
abundance relative to double-bonded carbonyl groups.
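The difference between the FG-based oxidation state and the 2O / C–H / C approximation can be illustrated with two one-carbon compounds. The FG contributions to oxidation state (−1 per C–H bond, +1 for a hydroxyl or hydroperoxide group attached to the carbon) are assumptions following the bond-based bookkeeping described above, and the compounds are chosen only for ease of verification.

```python
# Assumed FG contributions to the oxidation state of the attached carbon.
Z = {"aCH": -1, "aCOH": 1, "hydroperoxide": 1}

# Hypothetical test compounds: FG counts and elemental composition.
compounds = {
    "methanol (CH3OH)": {
        "groups": {"aCH": 3, "aCOH": 1},
        "atoms": {"C": 1, "H": 4, "O": 1},
    },
    "methyl hydroperoxide (CH3OOH)": {
        "groups": {"aCH": 3, "hydroperoxide": 1},
        "atoms": {"C": 1, "H": 4, "O": 2},
    },
}

results = {}
for name, info in compounds.items():
    n_c = info["atoms"]["C"]
    # FG-based estimate: only bonds directly on the carbon atom count.
    os_groups = sum(Z[j] * n for j, n in info["groups"].items()) / n_c
    # Elemental approximation: every oxygen is counted as -2.
    os_elemental = 2 * info["atoms"]["O"] / n_c - info["atoms"]["H"] / n_c
    results[name] = (os_groups, os_elemental)
    print(name, os_groups, os_elemental)
```

The two estimates coincide for the alcohol, whereas for the hydroperoxide the elemental approximation is higher by 2 because the second oxygen is not bonded to carbon.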
Estimation of carbon abundance
Table summarizes the new values for
λ^C obtained by the different estimation methods
described in Sect. . Comparison of
n^C* estimated using these values against nC*
in individual compounds is shown in Fig. , and the
comparison of n^C* and nC* in overall aerosol
mixtures at different time periods in the APIN simulation is shown in
Fig. .
Comparison of estimated (n^C*) and actual
(nC*) number of measurable carbon atoms in different SVOC
compounds (colored by their compound-averaged oxidation states,
OS‾C) using estimates of
λ^C for various FG sets and solution methods.
The diagonal line is the x=y line provided for visual reference. The ratio
is defined as n^C*/nC* and estimated as the
slope (not drawn) of n^C* regressed on nC*.
r is Pearson's correlation coefficient.
Ratios of estimated (n^C*) and actual
(nC*) number of measurable carbon atoms in the APIN simulated
aerosol mixture using estimates of λ^C for various FG
sets and solution methods. The gray horizontal line corresponds to y=1.0
(perfect estimate).
Comparison of measurement (MEAS) and simulations (SIM) for samples
ending approximately at 4 and 21 h (time-integrated over 3.1 to 4.2 and
17.6 to 21.6 h, respectively) after initiation of photochemistry
. Further details on labels for estimates are
defined in Sect. . Colors for (b) are
the same as for Fig. , except that ketone and aldehyde have
been combined into a single color (teal) because the reported measurements do
not differentiate between the two types of carbonyl.
Values for λ^C are roughly similar among estimation
methods, with the exception of the MIXTURE estimate. Overall, we find that the
coefficient for aCH is close to but less than the often assumed value of 0.5
(Table ), which can play an important role on account of the
abundance of aCH bonds and carbon types associated with aCH. For the MIXTURE
estimate, λ^C,aCH=0.5, but this is balanced by
exceptionally small coefficients for aCOH and hydroperoxide. This combination
of coefficients essentially downweights the contributions from carbon types
associated with aCH and hydroperoxide, which we know to be present in
abundance (within the top 6 for the APIN simulation at tmaxSOA,
and remaining significant throughout the simulation as seen in
Fig. ). Therefore, we conclude that the estimates
obtained for this fit are statistically convenient but less physically
relevant than the other estimates. For the NOMINAL case, we fix the aCH coefficient to
λC,aCH=0.45 and set the rest to the nearest rational
numbers.
For individual compounds, we note that using either Set1 or Set2 reproduces
nC* with similar biases on average: 11 % for COUNT and
within 4 % for the others. COUNT underestimates nC* in large
compounds with lower oxidation states containing many aCH groups, because of
the low estimate of λ^C,aCH. COMPOUND
reproduces nC* well because this is the data set COMPOUND was
fit to, but MIXTURE also does well. The NOMINAL solution also does well, but
largely owing to the λC,aCH adjustment.
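The ratio and correlation used to summarize agreement for individual compounds can be sketched as follows. The snippet assumes that the slope is obtained from a least-squares regression through the origin (the regression form is not specified in the text), and the compound values are invented purely for illustration; the function names are hypothetical.

```python
import math


def slope_through_origin(x, y):
    """Least-squares slope of y regressed on x with zero intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def pearson_r(x, y):
    """Pearson's correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: actual and estimated measurable carbon per compound.
n_c_actual = [4.0, 6.0, 8.0, 9.0, 10.0]
n_c_estimated = [3.9, 5.8, 7.5, 8.6, 9.3]

ratio = slope_through_origin(n_c_actual, n_c_estimated)
r = pearson_r(n_c_actual, n_c_estimated)
print(f"ratio = {ratio:.3f}, r = {r:.3f}")
```

A ratio below unity corresponds to underestimation of nC*, as reported for COUNT, while r indicates how consistently the estimate tracks the actual value across compounds.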
For reproducing mixture composition, trends in biases are similar to
individual compounds, with underestimation by as much as 18 % for COUNT
and within 7 % for the other estimation methods. MIXTURE performs the
best because this is the data set to which it was fitted, but the
COMPOUND and NOMINAL estimates are also acceptable. There is generally a trend toward
increasing n^C*/nC* over the duration of the
simulation, which indicates an evolving relationship between FGs and carbon
abundance with mixture composition. Time-dependent (i.e., mixture-specific)
estimates of λC may be warranted when the change in
composition becomes more significant.
We therefore conclude that errors for estimation of nC* can be
quite low and are well below 10 % according to our evaluation. Even a
10 % error in estimation of nC* will lead to a 9 % error
in the estimation of any individual atomic ratio, and a 5 % error in
the OM / OC ratio (Appendix ). Therefore,
in applying the NOMINAL coefficients to measured values of FGs under
conditions upon which the APIN simulations were based
(Sect. ), we discuss deterministic explanations
for model–measurement discrepancies with less consideration toward statistical
estimation error of nC*.
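The magnitudes quoted above can be checked with a short calculation. The sketch below assumes that only the carbon term is misestimated (hydrogen and heteroatom abundances are taken as exact) and that the true OM / OC is 2.0, an illustrative value that is not taken from the text. The function names are hypothetical, and the derivation in the Appendix is not reproduced here.

```python
def atomic_ratio_error(rel_error_nc):
    """Relative error in n_a/n_C when n_C is off by a fraction rel_error_nc."""
    return 1.0 / (1.0 + rel_error_nc) - 1.0


def om_oc_error(rel_error_nc, om_oc_true):
    """Relative error in OM/OC, writing OM/OC = 1 + (non-carbon mass)/(carbon mass).

    Assumes the non-carbon mass is exact and only the carbon mass is misestimated.
    """
    om_oc_est = 1.0 + (om_oc_true - 1.0) / (1.0 + rel_error_nc)
    return om_oc_est / om_oc_true - 1.0


rel_error = 0.10        # 10 % overestimate of measurable carbon
assumed_om_oc = 2.0     # illustrative assumption, not from the text

ratio_err = atomic_ratio_error(rel_error)
om_oc_err = om_oc_error(rel_error, assumed_om_oc)
print(f"atomic ratio error: {100 * ratio_err:.1f} %")
print(f"OM/OC error:        {100 * om_oc_err:.1f} %")
```

Under these assumptions a 10 % overestimate of nC* biases an atomic ratio by about 9 % and OM / OC by roughly half of that, with the OM / OC error growing in magnitude as the true OM / OC increases.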
Comparison with measurements
In this section, we discuss O / C, OM / OC, and
OS‾C estimated from measurements ending at
hours 4 and 21 and APIN simulation results integrated over the same periods
(Fig. ). We label the interpretation of measurements with
previous estimates of λC (Table ) as
“MEAS-PREV”, measurements with revised estimates of λC
(Table ) as “MEAS-NOM”, simulation results using FGs from
Set1 as “SIM-SET1”, and full simulation results as “SIM-FULL”; further
adjustments are made for the last three estimates as justified next. In
Sect. , we presented an estimate of mass
recovery (nC*/nC) and how this led to
biased estimates of atomic ratios and OM / OC ratio. In
Sect. , we also showed that we can derive estimates
of λC such that errors in estimation of nC*
were small (i.e., n^C*/nC* near unity).
Therefore, for the following comparisons, we neglect the latter error and
correct biases due to carbon mass recovery by using our best estimate of
nC, rather than nC*, as the
normalization factor. The proportion of detected carbon needed to make this
correction is obtained from SIM-SET1, in which the same FGs as in the measurements
are used. While the adjustment is only approximate on account of differences
between the real experimental system and model simulation, it reduces systematic
biases in carbon-centric metrics as described in
Sect. such that deviations from true ratios
can be largely attributed to the unmeasured heteroatoms. For MEAS-NOM, the
atomic ratio is then estimated as
na*/nC = na*/nC* × (nC*/nC)SIM-SET1, and the
OM / OC and OS‾C by similar adjustment.
MEAS-PREV remains unadjusted to be used as a reference estimated without
prior knowledge about the underlying molecular structures of the SOA
products.
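The recovery adjustment applied to MEAS-NOM can be sketched in a few lines. The snippet reads the correction factor as the fraction of total carbon that is measurable in SIM-SET1, which is our interpretation of the normalization described above. The function name and the numerical inputs are hypothetical, and the recovery of 0.8 is used only as an illustrative value.

```python
def recovery_corrected_ratio(n_a_meas, n_c_meas, recovery_sim):
    """Atomic ratio normalized by total rather than measurable carbon.

    n_a_meas     : measured abundance of atom a (e.g., oxygen), in moles
    n_c_meas     : estimated measurable carbon abundance, in moles
    recovery_sim : measurable / total carbon from the simulation (0 to 1]
    """
    if not 0.0 < recovery_sim <= 1.0:
        raise ValueError("recovery must lie in the interval (0, 1]")
    return (n_a_meas / n_c_meas) * recovery_sim


# Hypothetical inputs: O/C of 0.5 relative to measurable carbon,
# with 80 % of the total carbon assumed to be measurable.
o_to_c_adjusted = recovery_corrected_ratio(n_a_meas=0.5, n_c_meas=1.0,
                                           recovery_sim=0.8)
print(f"adjusted O/C = {o_to_c_adjusted:.2f}")
```

Because the recovery is at most unity, the adjustment can only lower a ratio that was normalized by measurable carbon alone.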
First, we remark on differences for estimated metrics from two sets of
coefficients applied to the same FG measurements. MEAS-PREV overestimates the
nC* compared to MEAS-NOM by 21–28 % on account of higher
λC coefficients used in the former. However, the
uncorrected bias due to lower mass recovery of carbon is approximately the
same magnitude, and ultimately leads to ratioed values (O / C, H / C,
OM / OC, OS‾C) similar to MEAS-NOM. While
it is not clear that λC derived in this work accurately
represents the true mixture, we posit that the degree of functionalization
characterized by the new estimate is likely to be more representative for the
product mixture after successive oxidation of the APIN, rather than APIN
itself (as assumed by MEAS-PREV). report O / C and
H / C estimates from FT-IR using coefficients of MEAS-PREV and found that
they were within range of aerosol mass spectrometer (AMS) values; this is possibly due to the offsetting
of errors as demonstrated here. In the remaining discussion, we base our
interpretation of observations on MEAS-NOM.
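The offsetting of errors can be illustrated numerically. In the sketch below, MEAS-PREV divides by a carbon estimate that is 21 to 28 % higher than that of MEAS-NOM, whereas MEAS-NOM is multiplied by a carbon recovery factor. The recovery of 0.8 is an assumed, illustrative value (the recovery for the specific sampling periods may differ), and the function name is hypothetical.

```python
def prev_to_nom_ratio(overestimate, recovery):
    """Ratio of an unadjusted MEAS-PREV atomic ratio to an adjusted MEAS-NOM one.

    MEAS-PREV: n_a / ((1 + overestimate) * n_C*)
    MEAS-NOM : (n_a / n_C*) * recovery
    """
    return 1.0 / ((1.0 + overestimate) * recovery)


assumed_recovery = 0.8  # illustrative assumption
for overestimate in (0.21, 0.28):
    ratio = prev_to_nom_ratio(overestimate, assumed_recovery)
    print(f"overestimate {overestimate:.2f}: PREV/NOM = {ratio:.3f}")
```

With these inputs the two estimates agree to within a few percent, which is consistent with the similar ratioed values noted above.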
MEAS-NOM and SIM-SET1 are the two estimates intended to provide the most
direct comparison between experiment and numerical simulation. While the
discrepancy in carbonyl and carboxyl groups at 4 h is only 2 and 3 % in
mole fraction, respectively , this leads to an overall
discrepancy of 0.16 for O / C and 0.2 for OM / OC. Since aCOH,
carbonyl, and COOH groups contribute more mass per mole than
the aCH group, discrepancies in molar abundance of oxygenated FGs are
magnified when represented in OM / OC ratios and can have a
non-negligible influence on interpretation of mass yields. After 21 h, the
difference is 0.38 in O / C and 0.48 in OM / OC.
attributed the apparent divergence to mechanisms not included in the model.
Oligomerization was not considered a likely candidate as this process is not
expected to contribute to the increased oxygenation reported by FT-IR.
Condensed-phase photolysis can lead to conversion of hydroperoxides to
carbonyls (some of which are lost to the vapor phase as more volatile
molecules) , but even a hypothetical full molar conversion
is insufficient to explain the model–measurement differences in carbonyl
groups . Other missing mechanisms may include
autoxidation , which can produce extremely low volatility organic compounds
(ELVOC; ) or highly oxygenated molecules (HOM;
) in the gas phase, or radical reactions in the
condensed phase that lead to highly oxidized products
containing these measured FGs. In these comparisons, we cannot rule out that
some biases in measurement may originate from molar absorption coefficients
estimated for each FG in FT-IR. The absorption intensity is determined by a
change in the magnitude of the dipole moment and can vary according to
molecule or mixture environment; the representativeness of applied absorption
coefficients in these SOA mixtures is a possible area for future inquiry.
However, cite variations on the order of 20 % for
oxygenated FGs in several carboxylic acid and ketone species, which provide
some constraints on this uncertainty for the range of compound classes
evaluated in their study.
As reported by , SIM-FULL has an O / C similar to
observations in similar chamber studies where aerosol mass spectrometer (AMS)
measurements were available . OM in MEAS-NOM is
less functionalized than in SIM-FULL at hour 4, but the opposite is true at
hour 21 even though hydroperoxide and peroxyacyl nitrate are not included. The
rate of transformation of these FGs remains uncertain – for instance,
reported lifetimes of hydroperoxides range from less than an hour to many
days ; resolving their reaction pathways may
play a critical role in understanding model–measurement discrepancies
. Using the estimates of MEAS-NOM, the additional oxidation
and aging process between 4 and 21 hours leads to an increase in O / C of
about 0.24, including a 0.09 difference in O / C from carbonyl (a product
of hydroperoxide photolysis). If we extrapolate the O / C of MEAS-NOM to
that which includes hydroperoxide and peroxyacyl nitrate groups by assuming
the same hydroperoxide and peroxyacyl nitrate contributions from SIM-FULL, we
would obtain an overall O / C ratio of 0.7 at hour 4 and 0.9 at hour 21.
The latter value is at the higher end of O / C values reported by AMS
e.g.,. A
concurrent measurement of overall O / C and O / C partitioned by
measured FG may provide better constraints on our understanding of OM
transformations.
As with O / C and OM / OC, OS‾C also
highlights the greater extent of functionalization in observations than in
simulations between hours 4 and 21. OS‾C
estimated from MEAS-NOM is in the range of low-volatility oxygenated organic
aerosol (LV-OOA) , while it is in the range of
semi-volatile oxygenated organic aerosol (SV-OOA) in the simulations,
consistent with the species included in the MCMv3.2 mechanism. In simulation,
the products found in the aerosol phase contain more than six carbon
atoms, and the smaller, highly oxidized molecules remain in the gas phase
(Sect. S3, Fig. S1 in the Supplement). As discussed in
Sect. and shown in comparison between
SIM-SET1 and SIM-FULL (Fig. c), the missing contributions from
hydroperoxide and peroxyacyl nitrate to OS‾C are
likely to be small as only the valence of the bonded atoms, and not the total
atomic count of the FGs, contributes to the carbon oxidation state.
Conclusions
This study extends the work of and
to demonstrate how molecular structure – specifically, functionalization –
can inform comparisons between model and measurement through knowledge of the
underlying carbon type abundances. For a measured subset of molar FG
abundances, we estimate the expected mass recovery of simulated OC and OM,
and how this impacts reported properties such as atomic ratios (O / C,
H / C) and OM / OC mass ratios that are of interest to the
atmospheric aerosol community. Furthermore, we show how information regarding
the underlying molecular structure can be used to better constrain the
abundance of polyfunctional carbon that can be estimated from measurements of
FGs.
For the α-pinene photooxidation simulation analyzed, we find that
80 % of the carbon is detectable by the set of commonly measured FGs, and
7 % is unmeasurable on account of having only carbon–carbon bonds. The
problem of multiply enumerating polyfunctional carbon atoms using FG
abundances for types in this simulated mixture introduces a smaller error,
typically less than 10 %. The coefficients required to map FG abundance
to carbon abundance vary slightly from what has been assumed for ambient
samples; until more studies are conducted there may be reason to continue
using previous coefficients for consistency. Comparison of simulation results
to measured O / C, OM / OC, and carbon oxidation state partitioned by
FG contributions elucidated the magnitude of the effect of missing LV-OOA (among other
classes of molecules) in our model on these widely used metrics. Our current
model includes only gas-phase chemistry prescribed by MCMv3.2 combined with
gas–particle partitioning, but such comparisons can be
extended as additional mechanisms are added. Within the context of this
framework, the value of improving our knowledge of SOA formation and aging,
investigating measurement artifacts, and developing calibration models for
additional FGs for improved comparison with models can be better evaluated.
Since FG analysis measures characteristics of carbon types present in
molecules of complex SOA mixtures, it can bridge our understanding of the
atomic composition (e.g., measured via AMS) and constituent molecules
identified by the growing number of emerging analytical methods
e.g., to place their contributions in
perspective. With regard to numerical simulation, model–measurement
integration using FGs can further guide development of chemical mechanism
generators e.g., and detailed
benchmark models e.g.,, upon which reduced chemical
reaction schemes are based e.g.,. We anticipate that
the work expounded in this series of papers will strengthen the ensemble
of tools available to study the complex phenomena of organic aerosol
formation and aging.