Determination of primary combustion source organic carbon-to-elemental carbon (OC/EC) ratio using ambient OC and EC measurements: secondary OC-EC correlation minimization method

Elemental carbon (EC), due to its exclusive origin in primary combustion sources, has been widely used as a tracer to track the portion of co-emitted primary organic carbon (OC) and, by extension, to estimate secondary OC (SOC) from ambient observations of EC and OC. Key to this EC tracer method is determining an appropriate OC/EC ratio that represents primary combustion emission sources (i.e., (OC/EC)pri) at the observation site. The conventional approaches include regressing OC against EC within a fixed percentile of the lowest OC/EC ratio data (usually 5-20%) or relying on a subset of sampling days with low photochemical activity and dominated by local emissions. The drawback of these approaches is rooted in their empirical nature, i.e., a lack of clear quantitative criteria in the selection of data subsets for the (OC/EC)pri determination. We examine here a method that derives (OC/EC)pri by calculating a hypothetical set of (OC/EC)pri and SOC and then seeking the minimum of the coefficient of correlation (R) between SOC and EC. The hypothetical (OC/EC)pri that generates the minimum R(SOC,EC) then represents the actual (OC/EC)pri ratio if variations of EC and SOC are independent. This Minimum R Squared (MRS) method

Atmos. Chem. Phys. Discuss., doi:10.5194/acp-2015-997, 2016. Manuscript under review for journal Atmos. Chem. Phys. Published: 19 January 2016. © Author(s) 2016. CC-BY 3.0 License.


Introduction
Organic carbon (OC) and elemental carbon (EC) are among the major components of fine particulate matter (PM2.5) (Malm et al., 2004). EC is a product of carbon fuel-based combustion processes and is exclusively associated with primary emissions, whereas OC can both be emitted directly and be formed through secondary pathways. Differentiation between primary organic carbon (POC) and secondary organic carbon (SOC) is indispensable for probing atmospheric aging processes of organic aerosols and formulating effective emission control policies. However, direct SOC measurement is not yet feasible, as knowledge of its chemical composition at the molecular level is lacking. Due to its exclusive origin in primary combustion sources, EC was first proposed by Turpin and Huntzicker (1991) as a tracer to track POC from primary combustion sources and, by extension, to estimate SOC, since SOC is simply the difference between OC and POC. This EC tracer method only requires measurements of OC and EC. Due to its simplicity, the EC tracer method has been widely adopted in studies reporting ambient OC and EC measurements (e.g., Castro et al., 1999; Cao et al., 2004; Yu et al., 2004). If OC and EC concentrations are available and primary OC from non-combustion sources (OC non-comb) is negligible, SOC can be estimated using EC as the tracer for combustion source POC (Turpin and Huntzicker, 1995):

$$\mathrm{POC} = (\mathrm{OC}/\mathrm{EC})_{\mathrm{pri}} \times \mathrm{EC} \qquad (1)$$

$$\mathrm{SOC} = \mathrm{OC}_{\mathrm{total}} - (\mathrm{OC}/\mathrm{EC})_{\mathrm{pri}} \times \mathrm{EC} \qquad (2)$$

where (OC / EC) pri is the OC / EC ratio in freshly emitted combustion aerosols, and OC total and EC are available from ambient measurements. Abbreviations used in this study are summarized in Table 1.

Published by Copernicus Publications on behalf of the European Geosciences Union.
The key step in the EC tracer method is to determine an appropriate OC / EC ratio that represents primary combustion emission sources (i.e., (OC / EC) pri) at the observation site. Approaches to deriving (OC / EC) pri reported in the literature are based either on emission inventories (Gray et al., 1986) or on ambient observation data. Using ambient observation data, three approaches are the most common: (1) regressing measured OC vs. EC data from times of low photochemical activity that are dominated by local emissions; (2) regressing measured OC vs. EC data within a fixed percentile of the lowest OC / EC ratios (usually 5-20 %) to represent samples dominated by primary emissions (Lim and Turpin, 2002; Lin et al., 2009); and (3) simply taking the minimum OC / EC ratio during the study period to approximate (OC / EC) pri (Castro et al., 1999). Combinations of the fixed percentile and the minimum OC / EC approaches have also been used to accommodate different sample sizes. For example, Pio et al. (2011) suggested using the lowest 5 % subset to obtain (OC / EC) pri and, if that subset contains fewer than three samples, using the lowest three data points instead. These approaches share the drawback that there is no clear quantitative criterion for selecting the data used in the (OC / EC) pri determination. Millet et al. (2005) were the first to propose an algorithm that exploits the inherent independence between pollutants from primary emissions (e.g., EC) and products of secondary formation processes (e.g., SOC) to derive the primary ratios (e.g., (OC / EC) pri) for species with multiple source types. More specifically, for the determination of (OC / EC) pri, the assumed (OC / EC) pri value is varied continuously. At each hypothetical (OC / EC) pri, SOC is calculated for the data set and the squared correlation coefficient (R²) of EC vs. SOC (i.e., R²(EC,SOC)) is generated. The series of R²(EC,SOC) values is then plotted against the assumed (OC / EC) pri values. If variations of EC and SOC are independent, the assumed (OC / EC) pri corresponding to the minimum R²(EC,SOC) then represents the actual (OC / EC) pri ratio. Such an approach obviates the need for an arbitrary selection criterion, as the algorithm seeks the minimum point, which is unique to the data set. However, this method has largely been overlooked, with only one study reporting its use (Hu et al., 2012) since its debut, possibly because its performance has not been evaluated. Hereafter, for convenience of discussion, we call this method the minimum R squared (MRS) method. An example illustration of the MRS method is shown in Fig. 1. We have developed a computer program in Igor Pro (WaveMetrics, Inc., Lake Oswego, OR, USA) to facilitate MRS calculation; it is available from https://sites.google.com/site/wuchengust.
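The MRS search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Igor Pro program: the function names (`pearson_r2`, `mrs_ratio`), the scan range and the step size are hypothetical choices for the example, and the synthetic data at the bottom exist only to exercise the code.

```python
import random


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # correlation is undefined for a constant series; treat as none
    return sxy * sxy / (sxx * syy)


def mrs_ratio(oc, ec, lo=0.0, hi=5.0, step=0.01):
    """Scan hypothetical (OC/EC)pri values; return (ratio, R2) at the minimum R2(EC, SOC)."""
    best_ratio, best_r2 = lo, float("inf")
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        ratio = lo + i * step
        soc = [o - ratio * e for o, e in zip(oc, ec)]  # SOC = OC - ratio * EC
        r2 = pearson_r2(ec, soc)
        if r2 < best_r2:
            best_ratio, best_r2 = ratio, r2
    return best_ratio, best_r2


# Synthetic check (hypothetical numbers): EC and SOC are independent log-normal
# series and the "true" primary ratio used to build OC is 2.0.
rng = random.Random(42)
ec = [rng.lognormvariate(0.58, 0.47) for _ in range(1000)]
soc_true = [rng.lognormvariate(0.58, 0.47) for _ in range(1000)]
oc = [2.0 * e + s for e, s in zip(ec, soc_true)]
ratio, r2 = mrs_ratio(oc, ec)
print(f"MRS-derived ratio: {ratio:.2f}, minimum R2: {r2:.4f}")
```

The grid scan mirrors the description in the text (vary the assumed ratio, recompute SOC, record R²); a finer step or a wider range can be passed in when the expected ratio lies outside the default window.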
With ambient OC and EC samples, the accuracy of SOC estimated by different (OC / EC) pri methods is difficult to evaluate due to the lack of a direct SOC measurement. The objective of this study is to investigate, through numerical simulations, the bias of SOC estimates by three different implementations of the EC tracer method. Hypothetical EC, OC, and (OC / EC) pri data sets with a known breakdown of POC and SOC values are numerically synthesized; SOC is then estimated and compared with the "true" SOC as defined by the synthetic data sets. In this way, the bias of SOC estimates from the various implementations of the EC tracer method can be quantified.
2 Evaluation of the minimum R squared method

Data generation
We first examine ambient OC and EC to identify distribution features that can serve as the reference basis for parameterizing the numerical experiments. The 1-year hourly EC and OC measurement data from three sites in the Pearl River Delta (PRD) (one suburban site in Guangzhou, a general urban site and a roadside site in Hong Kong, with more than 7000 data points at each site) are plotted in Fig. S1 in the Supplement for the whole-year data sets and in Figs. S2-S4 for the seasonal subsets, using the Nancun site as the example. A brief account of the field ECOC analyzers and their field operation is provided in the Supplement. A detailed description of the measurement results and data interpretation for the sites will be given in a separate paper. The distributions of measured OC, EC and OC / EC are fitted by both normal and log-normal distribution curves and then examined by the Kolmogorov-Smirnov (K-S) test. The K-S statistic, D, indicates that the log-normal distribution fits all three distributions better than the normal distribution (D values are shown in Figs. S1-S4). Therefore, log-normal distributions are adopted to define the OC, EC and OC / EC distributions during data generation in our numerical experiments. Statistics of these ambient OC and EC data, along with a few other measurements reported in the literature, are summarized in Table 2 and serve as the reference for data generation so that the simulations better represent real conditions. The probability density function (PDF) for the log-normal distribution of variable x is

$$f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right]$$

The two parameters, µ and σ, of the log-normal PDF are related to the average ($\bar{x}$) and standard deviation ($s$) of x through the following equations:

$$\mu = \ln(\bar{x}) - \frac{1}{2} \ln\left(1 + \frac{s^2}{\bar{x}^2}\right)$$

$$\sigma = \sqrt{\ln\left(1 + \frac{s^2}{\bar{x}^2}\right)}$$

First, realistic average and standard deviation values of EC, (OC / EC) pri, and OC (e.g. Figs. S1-S5) are adopted to calculate µ and σ. Then the pseudorandom number generator uses µ and σ to synthesize EC and OC data sets. The Mersenne twister (MT) (Matsumoto and Nishimura, 1998), a pseudorandom number generator, is used in data generation. MT is provided as a function in Igor Pro. The system clock is used as the initial condition (seed) for generating pseudorandom numbers. MT has a very long period of 2^19937 − 1, permitting large data sizes and ensuring that pseudorandom numbers are statistically independent between data generation runs. The latter feature ensures the independence of EC and non-combustion-related SOC data. The case with combustion-related SOC is briefly discussed in Sect. 3. MT also allows a log-normal distribution to be assigned during pseudorandom number generation to constrain the data. To verify the log-normality of MT-generated data, K-S tests are conducted on the generated data for 5000 runs. As shown in Fig. S6, 94.4 % of runs pass the K-S test. Hence the performance of MT satisfies the requirement of generating log-normally distributed data in this study. In a previous study, Chu (2005) used a variant of sine functions to simulate POC and EC, which limited the data size to 120, and the frequency distributions of POC and EC exhibited multiple peaks, a characteristic that is not realistic for ambient measurements. The key information utilized in the EC tracer method is the correlation between EC and POC as well as the lack of correlation between EC and SOC. Time series information is not needed in the EC tracer method, making a pseudorandom number generator a good fit for the evaluation purpose.
The procedure of data generation for the single emission source scenario is illustrated in Fig. 2 and implemented by scripts written in Igor Pro. EC is first generated with the following parameters specified: sample size (n), and the average and relative standard deviation (RSD%) of the whole data set (see Supplement). The EC data set statistically follows a log-normal distribution, while the sequence of each data point is randomly assigned. POC is then calculated by multiplying EC by (OC / EC) pri (Eq. 1). For simplicity, (OC / EC) pri is set to be a single value, while an analysis incorporating randomly generated log-normally distributed (OC / EC) pri values can be found in the Supplement, and a brief summary is given in Sect. 2.2. SOC data are independently generated in a similar way to that for EC. The sum of POC and SOC then yields the synthesized OC. OC and EC data generated in this way are used to calculate SOC by different implementations of the EC tracer method. The bias of SOC estimation can then be evaluated by comparing the calculated SOC with the "true" SOC values. Data generation for the scenarios with two primary emission sources is similar to the single source scenario and the steps are illustrated in Fig. S7.
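The single-source data generation steps can be sketched as follows. This is an illustrative Python analogue, not the authors' Igor Pro scripts: the function names and the example parameter values are assumptions, and Python's `random` module (which also uses the Mersenne twister as its core generator) stands in for the Igor Pro MT function. The log-normal parameters are obtained from the mean and RSD through the standard relations σ² = ln(1 + RSD²) and µ = ln(mean) − σ²/2.

```python
import math
import random


def lognormal_params(mean, rsd):
    """Return (mu, sigma) of a log-normal distribution with the given mean and RSD."""
    sigma2 = math.log(1.0 + rsd * rsd)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def synthesize_single_source(n, ec_mean, ec_rsd, soc_mean, soc_rsd, ratio_pri, seed=None):
    """Generate EC, POC, SOC and OC series for one primary source.

    EC and SOC are drawn independently from log-normal distributions;
    POC = ratio_pri * EC and OC = POC + SOC. With seed=None the generator
    is seeded from system entropy or the clock, so every call differs.
    """
    rng = random.Random(seed)
    mu_ec, sigma_ec = lognormal_params(ec_mean, ec_rsd)
    mu_soc, sigma_soc = lognormal_params(soc_mean, soc_rsd)
    ec = [rng.lognormvariate(mu_ec, sigma_ec) for _ in range(n)]
    soc = [rng.lognormvariate(mu_soc, sigma_soc) for _ in range(n)]
    poc = [ratio_pri * e for e in ec]
    oc = [p + s for p, s in zip(poc, soc)]
    return ec, poc, soc, oc


# Example with hypothetical inputs: EC = 2 +/- 1, SOC = 1 +/- 0.5, (OC/EC)pri = 0.5.
ec, poc, soc, oc = synthesize_single_source(
    n=8000, ec_mean=2.0, ec_rsd=0.5, soc_mean=1.0, soc_rsd=0.5, ratio_pri=0.5, seed=1
)
print(f"mean EC = {sum(ec) / len(ec):.2f}, mean OC = {sum(oc) / len(oc):.2f}")
```

Because the "true" POC and SOC series are returned alongside OC and EC, any SOC estimate computed from OC and EC alone can be compared directly against `soc`.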

Scenario study
Three scenarios are considered. Scenario 1 (S1) considers one single primary emission source. Scenario 2 (S2) considers two correlated primary emission sources, i.e., two sets of EC and POC, with each source having a single but different (OC / EC) pri value. An example of S2 is combined vehicular emissions from diesel-fuel and gasoline-fuel vehicles. These two sources of vehicular emissions have different (OC / EC) pri but often share a similar temporal variation pattern, making them well correlated. Scenario 3 (S3) considers two independent primary emission sources and simulates an ambient environment influenced by, e.g., local vehicular emissions (lower (OC / EC) pri) and regional biomass burning (higher (OC / EC) pri).
In the following numerical experiments, three (OC / EC) pri estimation methods are examined and compared: MRS, OC / EC 10 % and OC / EC min. Because a single minimum point in ambient samples may be subject to large random uncertainties, data with the lowest 1 % OC / EC are adopted instead to derive OC / EC min.
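The two subset-based estimators can be sketched as below. The helper names are hypothetical, and one simplification is assumed: the primary ratio of the selected subset is taken as the ratio of summed OC to summed EC, whereas the approaches cited in the text regress OC against EC within the subset.

```python
def primary_ratio_from_lowest(oc, ec, fraction):
    """Approximate (OC/EC)pri from the samples with the lowest OC/EC ratios.

    `fraction` is the share of samples kept (0.10 for OC/EC 10 %, 0.01 for the
    lowest-1 % version of OC/EC min). At least one sample is always kept.
    All EC values are assumed to be positive.
    """
    pairs = sorted(zip(oc, ec), key=lambda pair: pair[0] / pair[1])
    k = max(1, round(len(pairs) * fraction))
    lowest = pairs[:k]
    # Assumption: ratio of sums instead of a regression slope.
    return sum(o for o, _ in lowest) / sum(e for _, e in lowest)


def estimate_soc(oc, ec, ratio_pri):
    """EC tracer method: SOC = OC - (OC/EC)pri * EC for every sample."""
    return [o - ratio_pri * e for o, e in zip(oc, ec)]


# Toy data: EC fixed at 1 and OC rising steadily, so OC/EC spans 1.00 to 5.95.
ec = [1.0] * 100
oc = [1.0 + 0.05 * i for i in range(100)]
ratio_10 = primary_ratio_from_lowest(oc, ec, 0.10)
ratio_min = primary_ratio_from_lowest(oc, ec, 0.01)
print(f"OC/EC 10 %: {ratio_10:.3f}, OC/EC min (lowest 1 %): {ratio_min:.3f}")
```

Both estimators depend entirely on where the chosen cut-off falls in the ambient OC / EC distribution, which is the empirical element the MRS method is meant to avoid.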

Single primary source scenario
Both the OC / EC 10 % and OC / EC min methods rely on a subset of ambient OC and EC data to approximate (OC / EC) pri. Figure 3 provides a conceptual illustration of the relationship between (OC / EC) pri and the ambient OC / EC data, both of which are assumed to follow a log-normal distribution. As primary emissions move away from sources and aging processes start in the atmosphere, SOC is added to the particle OC fraction, elevating OC / EC above (OC / EC) pri. This in effect broadens the OC / EC distribution curve and shifts it to the right along the OC / EC axis; the degree of broadening and shift depends on the degree of aging. The conventional EC tracer method using OC / EC 10 % and OC / EC min assumes that the left tail of the ambient OC / EC distribution is very close to (OC / EC) pri. This assumption, however, holds only fortuitously rather than as a rule. Two parameters, the distance between the means of the (OC / EC) pri and ambient OC / EC distributions and the relative breadth of the two distributions, largely determine how closely OC / EC 10 % and OC / EC min approximate (OC / EC) pri. The distance between the two distributions depends on the fraction of SOC in OC (i.e., f SOC), while the width of the ambient OC / EC distribution is closely associated with the RSD of SOC (RSD SOC) and the width of the (OC / EC) pri distribution is reflected in RSD POC and RSD EC. As shown in Fig. 3a, only an appropriate combination of the distance between the two distribution means and their variances leads to a close approximation of (OC / EC) pri by OC / EC 10 % or OC / EC min (i.e., the left tail of the OC / EC distribution). If the ambient aerosol has a significant f SOC, shifting the ambient OC / EC distribution such that its left tail is beyond (OC / EC) pri (Fig. 3b), then the left tail would overestimate (OC / EC) pri. Underestimation of (OC / EC) pri could also happen in theory, as shown in Fig. 3c, if the ambient minimum OC / EC (left tail) is less than the mean of the (OC / EC) pri distribution (i.e., under conditions of very small f SOC).
The above analysis reveals that f SOC, RSD SOC, RSD POC, and RSD EC are key parameters influencing the accuracy of SOC estimation. They are therefore chosen for the subsequent sensitivity tests probing the SOC estimate bias under different carbonaceous aerosol compositions.
SOC estimation bias in S1 as a function of RSD SOC and RSD EC is shown in Fig. 4a and b. The SOC estimate by MRS is not affected by the magnitude of RSD EC and RSD SOC, and is in excellent agreement with the true values (Fig. 4). In comparison, SOC by OC / EC 10 % and OC / EC min is consistently biased low, and the degree of negative bias becomes larger with decreasing RSD SOC or RSD EC. The OC / EC 10 % method always produces a larger negative bias than the OC / EC min method. With RSD SOC and RSD EC at 50 %, the SOC estimate has a −14 % bias by OC / EC min and a −45 % bias by OC / EC 10 %. These results confirm the hypothesis illustrated in the conceptual diagram (Fig. 3) that the validity of using the left tail of the OC / EC distribution depends on the distance of its distribution mean from (OC / EC) pri and on the distribution breadth. Both the OC / EC 10 % and the OC / EC min methods underestimate SOC, and the degree of underestimation by the OC / EC 10 % method is worse.
The case in which (OC / EC) pri in the simulated data is log-normally distributed rather than a single value is also analyzed, evaluating SOC estimation bias as a function of RSD EC, RSD SOC, and f SOC. Table S2 summarizes the results obtained under the most probable ambient conditions (i.e., RSD EC: 50-100 %, f SOC: 40-60 %). SOC bias by MRS is within 4 % when measurement uncertainty is ignored. In comparison, SOC bias by OC / EC min is more sensitive to the assumption of log-normally distributed (OC / EC) pri than to that of a single-value (OC / EC) pri, including its dependence on RSD EC and RSD SOC with varied f SOC.

Scenarios assuming two primary sources
In the real atmosphere, it is normal for multiple combustion sources to impact a site. We next evaluate the performance of the MRS method in scenarios of two primary sources and arbitrarily dictate that the (OC / EC) pri of source 1 is lower than that of source 2. By varying f EC1 (the proportion of source 1 EC in total EC) from test to test, the effect of different mixing ratios of the two sources can be examined. Common configurations in S2 and S3 include the following: EC total = 2 ± 0.4 µgC m⁻³; f EC1 varies from 0 to 100 %; the ratio of the two (OC / EC) pri values (γ_pri) varies in the range of 2-8.
In Scenario 2 (i.e., two correlated primary sources), three factors are examined, namely f EC1, γ_pri and f SOC, to probe their effects on SOC estimation. By varying f EC1, the effect of different mixing ratios of the two sources can be examined, as f EC1 is expected to vary within the same ambient data set as a result of the spatiotemporal dynamics of air masses. MRS reports unbiased SOC, irrespective of f EC1, f SOC or γ_pri (Fig. 5). In comparison, SOC by OC / EC 10 % and OC / EC min is underestimated. For the OC / EC min method the degree of underestimation depends on f SOC, e.g., −12 % at f SOC = 25 % versus −20 % at f SOC = 40 %, while for the OC / EC 10 % method the magnitude of underestimation has a very weak dependence on f SOC, staying around −40 % as f SOC is doubled from 20 to 40 %. The degree of SOC bias by OC / EC 10 % and OC / EC min is independent of f EC1 and γ_pri, as SOC bias is associated with RSD EC, RSD SOC and f SOC. Since the two primary sources are well correlated, RSD EC is equivalent between the two sources. As a result, the overall RSD EC is constant when f EC1 and γ_pri vary, and the SOC bias is independent of f EC1 and γ_pri.

In summary, in scenarios of two well-correlated primary combustion sources, MRS always produces unbiased SOC estimates while OC / EC min and OC / EC 10 % consistently underestimate SOC, with OC / EC 10 % producing larger negative bias.
As for Scenario 3, in which two independent primary sources co-exist, SOC estimates by MRS could be biased, and the degree and direction of bias depend on f EC1. Figure 6a shows the variation of SOC bias with f EC1 when f SOC is fixed at 40 %. The variation of SOC bias by MRS with f EC1 follows a pseudo-sine curve, exhibiting negative bias when f EC1 < 50 % (i.e., EC is dominated by source 2, the higher (OC / EC) pri source) and positive bias when f EC1 > 50 %; the bias is confined to the range of −20 to +40 % under the condition of f SOC = 40 %. In comparison, the OC / EC min and OC / EC 10 % methods again consistently underestimate SOC by more than 50 %, with the bias being worse in the OC / EC 10 % method.
The bias variation range becomes narrower with increasing f SOC in the MRS method, as shown by the boxplots for four f SOC conditions (20, 40, 60, and 80 %) in Fig. 6b. The MRS-derived SOC bias range is reduced from −20 to +40 % at f SOC = 40 % to −10 to +20 % at f SOC = 60 %, and further to −6 to +10 % at f SOC = 80 %. In the other two methods, the SOC bias does not improve with increasing f SOC. The dependence of the SOC estimation bias on γ_pri is examined in Fig. 6c, which shows that a higher γ_pri induces a larger amplitude of SOC bias. If OC is dominated by SOC (e.g., f SOC = 80 %), SOC bias by MRS is within 10 %.
A variant of MRS implementation (denoted as MRS′) is examined, with the important difference that EC 1 and EC 2, attributed to source 1 and source 2, respectively, are used as inputs instead of total EC. With knowledge of the EC breakdown between the two primary sources, (OC / EC) pri1 can be determined by MRS from EC 1 and OC total. Similarly, (OC / EC) pri2 can be calculated by MRS from EC 2 and OC total. SOC is then calculated with the following equation:

$$\mathrm{SOC} = \mathrm{OC}_{\mathrm{total}} - (\mathrm{OC}/\mathrm{EC})_{\mathrm{pri1}} \times \mathrm{EC}_1 - (\mathrm{OC}/\mathrm{EC})_{\mathrm{pri2}} \times \mathrm{EC}_2$$

MRS′ produces unbiased SOC, irrespective of the different carbonaceous compositions (Fig. 6). However, we note that there is a great challenge in meeting the data needs of MRS′, as EC 1 and EC 2 are not available.

[Figure caption fragment] ... (2, 4, 6 and 8). The symbols in the boxplots are empty circles for the average, the line inside the box for the median, the upper and lower boundaries of the box for the 75th and the 25th percentiles, and the whiskers above and below each box for the 95th and 5th percentiles.
In Scenario 3, the simulation results imply that three factors are associated with the SOC bias by MRS: f EC1, γ_pri and f SOC. The first factor controls whether SOC bias by MRS is positive or negative. The latter two affect the degree of SOC bias. For high f SOC conditions, the bias could be acceptable. If EC 1 and EC 2 can be differentiated for calculating the individual (OC / EC) pri of each source, unbiased SOC estimation is achievable regardless of the values f EC1, γ_pri and f SOC take.
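The source-resolved variant can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the function names, the scan range, and the synthetic data (two independent log-normal EC series with primary ratios 0.5 and 2.0). For speed, R² at each candidate ratio is computed from sums of squares and cross products accumulated once, which gives the same value as recomputing the correlation of EC with OC − ratio × EC at every step.

```python
import random


def mrs_ratio(oc, ec, hi=10.0, step=0.01):
    """Ratio in [0, hi] that minimizes R2(EC, OC - ratio*EC), scanned on a grid."""
    n = len(ec)
    mean_ec, mean_oc = sum(ec) / n, sum(oc) / n
    # Sums of squares and cross products about the means (computed once).
    s_ee = sum((e - mean_ec) ** 2 for e in ec)
    s_oo = sum((o - mean_oc) ** 2 for o in oc)
    s_eo = sum((e - mean_ec) * (o - mean_oc) for e, o in zip(ec, oc))
    best_ratio, best_r2 = 0.0, float("inf")
    for i in range(int(round(hi / step)) + 1):
        ratio = i * step
        cross = s_eo - ratio * s_ee                               # EC vs. SOC cross product
        s_ss = s_oo - 2.0 * ratio * s_eo + ratio * ratio * s_ee   # SOC sum of squares
        if s_ss <= 0.0:
            continue  # SOC would be constant at this ratio; R2 is undefined
        r2 = cross * cross / (s_ee * s_ss)
        if r2 < best_r2:
            best_ratio, best_r2 = ratio, r2
    return best_ratio


def mrs_prime(oc_total, ec1, ec2):
    """Apply MRS to each source's EC separately, then subtract both primary terms."""
    ratio1 = mrs_ratio(oc_total, ec1)
    ratio2 = mrs_ratio(oc_total, ec2)
    soc = [o - ratio1 * a - ratio2 * b for o, a, b in zip(oc_total, ec1, ec2)]
    return ratio1, ratio2, soc


# Hypothetical Scenario 3 data: two independent sources plus independent SOC.
rng = random.Random(3)
n = 20000
ec1 = [rng.lognormvariate(-0.11, 0.47) for _ in range(n)]
ec2 = [rng.lognormvariate(-0.11, 0.47) for _ in range(n)]
soc_true = [rng.lognormvariate(0.40, 0.47) for _ in range(n)]
oc_total = [0.5 * a + 2.0 * b + s for a, b, s in zip(ec1, ec2, soc_true)]
ratio1, ratio2, soc_est = mrs_prime(oc_total, ec1, ec2)
bias = (sum(soc_est) - sum(soc_true)) / sum(soc_true)
print(f"ratio1 = {ratio1:.2f}, ratio2 = {ratio2:.2f}, SOC bias = {bias:+.3f}")
```

The sketch presupposes that EC 1 and EC 2 are known, which is exactly the data requirement the text identifies as the practical obstacle.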

Impact of measurement uncertainty
In the preceding numerical analysis, the simulated EC and OC are not assigned any measurement uncertainty; in reality, however, every EC and OC measurement carries a certain degree of measurement uncertainty. We next examine the influence of OC and EC measurement uncertainty on the accuracy of SOC estimation by the different EC tracer methods. Two uncertainty types are tested: constant relative uncertainty (Case A) and constant absolute uncertainty (Case B). This section mainly focuses on sensitivity tests assuming different degrees of Case A uncertainty. Results assuming Case B uncertainties are discussed in the next section. The uncertainties are assumed to follow a uniform distribution and are generated separately by MT. It is also assumed that the uncertainty (ε EC or ε OC) is proportional to the concentration of EC or OC through the multiplier γ unc (i.e., the relative measurement uncertainty).
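The two uncertainty cases can be sketched as follows. The function names and example values are hypothetical, and the sketch only perturbs a given series directly; it does not reproduce the back-calculation of POC and SOC uncertainties used in the study.

```python
import random


def with_relative_uncertainty(values, gamma_unc, rng):
    """Case A: add an error drawn uniformly within +/- gamma_unc * value."""
    return [v + rng.uniform(-gamma_unc * v, gamma_unc * v) for v in values]


def with_absolute_uncertainty(values, half_width, rng):
    """Case B: add an error drawn uniformly within +/- half_width (same units as values)."""
    return [v + rng.uniform(-half_width, half_width) for v in values]


rng = random.Random(11)
ec_true = [2.0, 1.0, 4.0, 0.5]
ec_case_a = with_relative_uncertainty(ec_true, gamma_unc=0.10, rng=rng)  # 10 % relative
ec_case_b = with_absolute_uncertainty(ec_true, half_width=0.2, rng=rng)  # +/- 0.2 absolute
print(ec_case_a)
print(ec_case_b)
```

In Case A the error scales with concentration, so low-concentration samples are perturbed least in absolute terms; in Case B the same absolute band applies to every sample, which weighs more heavily on low-concentration samples.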
In order to compare the estimated SOC with the simulated SOC that includes ε SOC, the measurement uncertainties of POC and SOC are back-calculated following the uncertainty propagation formula, assuming the same relative measurement uncertainty for POC and SOC (Harris, 2010). The simulated EC, POC and SOC with measurement uncertainties (abbreviated as EC simulated, POC simulated and SOC simulated, respectively) are obtained by adding the corresponding uncertainties to the true values. Sensitivity tests of SOC estimation as a function of relative measurement uncertainty (γ unc) and f SOC are performed, as shown in Fig. 7, by comparing the estimated SOC with SOC simulated. Fixed input parameters include n = 8000; EC = 2 ± 1 µgC m⁻³; (OC / EC) pri = 0.5. Studies by Chu (2005) and Saylor et al. (2006) suggest that the ratio of average POC to average EC (ROA, see Supplement for details) is the best estimator of the expected primary OC / EC ratio because it is mathematically equivalent to the true regression slope when the data contain no intercept. ROA is confirmed as the best representation of (OC / EC) pri for SOC estimation, showing no bias as γ unc or f SOC changes. MRS overestimates SOC, and the positive bias increases with γ unc while decreasing with f SOC (Fig. 7). The SOC estimates by OC / EC min and OC / EC 10 % exhibit larger bias than those by MRS. For example, as shown in Fig. 7a, when f SOC = 20 % and γ unc = 10 %, the bias of SOC by MRS, OC / EC 10 % and OC / EC min is 8, −28 and 36 %, respectively. With increasing f SOC, the bias of SOC by OC / EC min decreases while the bias of SOC by OC / EC 10 % increases when γ unc = 10-20 %. MRS always demonstrates the best performance in SOC determination among the three (OC / EC) pri estimation methods. When γ unc can be controlled within 20 %, the SOC bias by MRS does not exceed 23 % when f SOC = 20 % (Fig. 7a). If f SOC falls in the range of 60-80 % and γ unc is < 20 %, OC / EC min performs similarly to MRS, but SOC by OC / EC 10 % still shows a large bias (∼ 41 %) (Fig. 7c and d).
Sensitivity studies of SOC estimation as a function of γ unc and (OC / EC) pri are performed and the results are shown in Fig. S8. In all three (OC / EC) pri representations, SOC estimates are sensitive to γ unc but insensitive to the magnitude of (OC / EC) pri. In the single primary source scenario (S1), the results show that the performance of MRS in SOC estimation is mainly affected by γ unc and to a lesser degree by f SOC. Other variables such as (OC / EC) pri and EC concentration do not affect the accuracy of SOC estimation.

Impact of sample size
MRS relies on correlations of input variables, and its performance is therefore expected to be sensitive to the sample size of the input data set. This section examines the sensitivity of the three (OC / EC) pri representations to sample size and aims to suggest an appropriate sample size when applying MRS to ambient ECOC data. Sample sizes ranging from 20 to 8000 are tested, and for each sample size 500 repeat runs are conducted to obtain statistically robust results. Both Case A (i.e., a constant relative uncertainty of 10 %) and Case B (i.e., a constant absolute uncertainty of ±0.2 µgC m⁻³ for both OC and EC) are considered. The measurement uncertainties in Case B are generated separately by MT following a uniform distribution within the range of ±0.2 µgC m⁻³. The measurement uncertainties of POC and SOC are then back-calculated following the uncertainty propagation formula (Harris, 2010), assuming that the ratio ε POC / ε SOC is the same as the POC / SOC ratio (controlled by f SOC).
The mean SOC bias by MRS is very small (< 3 %) for all sample sizes, while the standard deviation of SOC bias decreases with sample size (Fig. 8). The standard deviation of SOC bias is ∼ ±30 % at the lowest tested sample size (n = 20), and decreases to less than ±15 % at n = 60 (the sample size of 1 year of sampling in an every-6-day sampling program) and to less than ±10 % at n = 200. Similar patterns are observed between Case A (Fig. 8a) and Case B (Fig. 8b) for MRS and OC / EC 10 %. For OC / EC min, a larger bias is observed in Case B than in Case A for all sample sizes, as SOC bias by OC / EC min is more sensitive to measurement uncertainty in the range of 0-10 %, as shown in Fig. 7b. [...] both decrease with sample size as shown in Fig. 8. The mean SOC bias of OC / EC min decreases with increased sample size, while that of OC / EC 10 % is insensitive to sample size. The sample size dependency of all three (OC / EC) pri representations is not sensitive to f SOC, as shown in Fig. S16. Other scenarios considering (OC / EC) pri with a distribution and different f SOC are discussed in the Supplement.
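A repeat-run experiment of this kind can be sketched in Python as below. It is a simplified illustration under assumptions that differ from the study: no measurement uncertainty is added, fewer repeat runs are used, and all names and default parameter values are hypothetical. R² at each candidate ratio is computed from sums of squares accumulated once per data set, which is equivalent to recomputing the EC vs. SOC correlation at every step.

```python
import math
import random
import statistics


def lognormal_params(mean, rsd):
    """Return (mu, sigma) of a log-normal distribution with the given mean and RSD."""
    sigma2 = math.log(1.0 + rsd * rsd)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def mrs_ratio(oc, ec, hi=5.0, step=0.01):
    """Ratio in [0, hi] that minimizes R2(EC, OC - ratio*EC), scanned on a grid."""
    n = len(ec)
    mean_ec, mean_oc = sum(ec) / n, sum(oc) / n
    s_ee = sum((e - mean_ec) ** 2 for e in ec)
    s_oo = sum((o - mean_oc) ** 2 for o in oc)
    s_eo = sum((e - mean_ec) * (o - mean_oc) for e, o in zip(ec, oc))
    best_ratio, best_r2 = 0.0, float("inf")
    for i in range(int(round(hi / step)) + 1):
        ratio = i * step
        cross = s_eo - ratio * s_ee
        s_ss = s_oo - 2.0 * ratio * s_eo + ratio * ratio * s_ee
        if s_ss <= 0.0:
            continue  # SOC would be constant at this ratio; R2 is undefined
        r2 = cross * cross / (s_ee * s_ss)
        if r2 < best_r2:
            best_ratio, best_r2 = ratio, r2
    return best_ratio


def soc_bias_stats(n, runs, rng, ratio_pri=0.5, f_soc=0.4, ec_mean=2.0, rsd=0.5):
    """Mean and SD of the relative SOC bias over `runs` synthetic data sets of size n."""
    # Choose the SOC mean so that SOC / (POC + SOC) equals f_soc on average.
    soc_mean = f_soc / (1.0 - f_soc) * ratio_pri * ec_mean
    mu_ec, sd_ec = lognormal_params(ec_mean, rsd)
    mu_soc, sd_soc = lognormal_params(soc_mean, rsd)
    biases = []
    for _ in range(runs):
        ec = [rng.lognormvariate(mu_ec, sd_ec) for _ in range(n)]
        soc = [rng.lognormvariate(mu_soc, sd_soc) for _ in range(n)]
        oc = [ratio_pri * e + s for e, s in zip(ec, soc)]
        ratio = mrs_ratio(oc, ec)
        soc_est = sum(o - ratio * e for o, e in zip(oc, ec))
        biases.append((soc_est - sum(soc)) / sum(soc))
    return statistics.mean(biases), statistics.stdev(biases)


rng = random.Random(7)
for n in (20, 60, 200):
    mean_bias, sd_bias = soc_bias_stats(n, runs=100, rng=rng)
    print(f"n = {n:4d}  mean bias = {mean_bias:+.3f}  SD = {sd_bias:.3f}")
```

The quantity of interest is the spread of the bias across runs: the mean stays near zero while the standard deviation shrinks as the sample size grows, which is the qualitative pattern described in the text.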

Impact of sampling time resolution
Besides hourly measurements of OC and EC by online aerosol carbon analyzers, the MRS method could also be applied to offline measurements of OC and EC based on filters collected over longer durations (e.g., 24 h), which are more readily available around the world. To explore the impact of sampling duration (e.g., hourly vs. daily), we average the 1-year hourly data from the suburban site in Guangzhou into longer intervals of 2-24 h. The 24 h averaged samples yield an (OC / EC) pri of 2.53, 12 % higher than the (OC / EC) pri derived from hourly data (2.26). This is because the OC / EC distribution narrows as the averaging interval lengthens (Fig. 9), which elevates the MRS-derived (OC / EC) pri. As many PM2.5 speciation networks adopt a sampling schedule of one 24 h sample every 6 days, we further extract the every-6-day samples for the MRS calculation. The 1-year data yield six subsets of daily samples, corresponding to the six possible schedules of sampling days at the every-6-day sampling frequency. The MRS calculation produces (OC / EC) pri in the range of 2.37-2.75 (5-22 % higher than the (OC / EC) pri from the hourly data). This example illustrates that if 24 h sample ECOC data are used, SOC would be biased slightly lower in comparison with SOC derived from the hourly data.
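The resampling steps (averaging hourly values into longer intervals and extracting the six possible every-6-day schedules) can be sketched as below. The helper names are hypothetical, and the sketch assumes a gap-free hourly series; real records with missing hours would need a completeness rule that is not specified here.

```python
def block_average(values, width):
    """Average consecutive, non-overlapping blocks of `width` values.

    An incomplete trailing block is dropped.
    """
    n_blocks = len(values) // width
    return [sum(values[i * width:(i + 1) * width]) / width for i in range(n_blocks)]


def every_nth_day_subsets(daily, every=6):
    """Return the `every` possible subsets obtained by sampling one day in `every`."""
    return [daily[offset::every] for offset in range(every)]


# Toy hourly series covering 12 days (values are arbitrary placeholders).
hourly_oc = [float(h % 24) for h in range(12 * 24)]
daily_oc = block_average(hourly_oc, 24)
schedules = every_nth_day_subsets(daily_oc, every=6)
print(len(daily_oc), [len(s) for s in schedules])
```

OC and EC would each be averaged this way before the MRS calculation is applied to the averaged pairs, once for the full daily series and once for each of the six schedules.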

Caveats of the MRS method in its applications to ambient data
Table 3 summarizes the performance, in terms of SOC estimation bias, of the different implementations of the EC tracer method, assuming typical variation characteristics for ambient ECOC data. When employing the EC tracer method on ambient samples, MRS is preferred since it provides more accurate SOC estimation.
If the sampling site is dominated by a single primary source (similar to Scenario 1), MRS can perform much better than the traditional OC / EC percentile and minimum approaches. Two issues require attention when applying MRS: (1) MRS relies on the independence of EC and SOC. This assumption could be invalid if a fraction of SOC is formed from semi-volatile POC (here referred to as SOC svP) (Robinson et al., 2007). Since POC is well correlated with EC, this SOC svP would be attributed to POC by MRS, causing SOC underestimation. The interference of SOC svP will be discussed in a separate paper. (2) OC non-comb will be at-

If the sampling site is influenced by two correlated primary sources with distinct (OC / EC) pri (Scenario 2, e.g. urban areas that have vehicular emissions from both gasoline and diesel), MRS is still much more reliable than the traditional OC / EC percentile and minimum approaches.

If the sampling site is influenced by two independent primary sources with distinct (OC / EC) pri (Scenario 3, e.g. vehicular emissions and biomass burning), SOC estimation by MRS is better than that by the other two conventional methods. It should be noted, however, that bias may exist and that its magnitude depends on the relative abundance of the two sources. If tracers are available to demarcate the EC contributions of the different primary sources, unbiased SOC estimation is possible by employing these tracers in MRS.

Conclusions
In this study, the accuracy of SOC estimation by the EC tracer method is evaluated by comparing three (OC / EC) pri determination approaches using numerically simulated data. The MRS method has a clear quantitative criterion for the (OC / EC) pri calculation, while the other two commonly used methods, namely minimum OC / EC (OC / EC min) and OC / EC percentile (e.g. OC / EC 10 %), are empirical in nature. Three scenarios are considered in the numerical simulations to evaluate the SOC estimation bias by the different EC tracer methods, assuming typical variation characteristics for ambient ECOC data. In the scenarios of a single primary source and two well-correlated primary combustion sources, SOC estimates by MRS are unbiased while OC / EC min and OC / EC 10 % consistently underestimate SOC when measurement uncertainty is neglected. When measurement uncertainty is considered, all three approaches produce biased SOC estimates, with MRS producing the smallest bias. The bias by MRS does not exceed 23 % when measurement uncertainty is within 20 % and f SOC is not lower than 20 %. In the scenario of two independent primary sources, SOC by MRS exhibits bias but still performs better than OC / EC min and OC / EC 10 %. If EC from each independent source can be differentiated to allow calculation of the individual (OC / EC) pri for each source, unbiased SOC estimation is achievable. Sensitivity tests of OC and EC measurement uncertainty on SOC estimation demonstrate the superior accuracy of MRS over the other two approaches.
Sensitivity tests show that MRS produces mean SOC values with a very small bias for all sample sizes, while the precision worsens as the sample size decreases. For a data set with a sample size of 60, SOC bias by MRS is 2 ± 15 %. When the sample size is 200, the results by MRS improve to 2 ± 8 %. It is clear that when employing the EC tracer method to estimate SOC, MRS is preferred over the two conventional methods (OC / EC 10 % and OC / EC min) since it can provide more accurate SOC estimation. We also evaluated the impact of longer sampling duration on the derived (OC / EC) pri and found that if 24 h sample ECOC data are used, SOC would be biased slightly lower in comparison with SOC derived from the hourly data.
The Supplement related to this article is available online at doi:10.5194/acp-16-5453-2016-supplement.

Figure 3. Conceptual diagram illustrating three scenarios of the relationship between (OC / EC) pri and ambient OC / EC measurements. Both are assumed to be log-normally distributed. (a) Ambient minimum (left tail) is equal to the peak of (OC / EC) pri . (b) Ambient minimum OC / EC (left tail) is larger than the mean of (OC / EC) pri . (c) Ambient minimum OC / EC (left tail) is less than the peak of (OC / EC) pri .

Figure 5. SOC bias in Scenario 2 (two correlated primary emission sources of different (OC / EC) pri ) as estimated by four different EC tracer methods denoted in red, blue and yellow. (a) SOC bias as a function of f EC1 . Results shown here are calculated using f SOC = 40 % as an example. (b) Range of SOC bias shown in boxplots for four f SOC conditions (20, 25, 30 and 40 %). (c) Range of SOC bias shown in boxplots for four γ _pri conditions (2, 4, 6 and 8). The symbols in the boxplots are empty circles for the average, the line inside the box for the median, the box boundaries for the 75th and 25th percentiles, and the whiskers for the 95th and 5th percentiles.
svP: SOC formed from semi-volatile POC
γ _pri: ratio of the (OC / EC) pri of source 2 to source 1
ε EC , ε OC: measurement uncertainty of EC and OC
γ unc: relative measurement uncertainty
γ _RSD: the ratio between the RSD values of (OC / EC) pri and EC

www.atmos-chem-phys.net/16/5453/2016/ Atmos. Chem. Phys., 16, 5453-5465, 2016
used a variant of sine functions to simulate POC and EC, which limited the data size to 120, and the frequency distributions of POC and EC exhibited multiple peaks, a characteristic that is not realistic for

Table 2. Summary of statistics of OC and EC in ambient samples.

Figure 1. Illustration of the minimum R squared method (MRS) to determine (OC / EC) pri using 1 year of hourly OC and EC measurements at a suburban site in the Pearl River Delta, China. The red curve shows the correlation coefficient (R²) between SOC and EC as a function of assumed (OC / EC) pri . The shaded area in tan represents the frequency distribution of the OC / EC ratio for the entire OC and EC data set. The green dashed curve is the cumulative frequency curve of the OC / EC ratio. Axes: OC / EC percentile (%); (OC / EC) pri scale and measured OC / EC; R² (SOC vs. EC). Plot annotation: minimum R² at ratio = 2.26 (24.77 % percentile), N = 7217.
. The standard deviation of SOC bias by OC / EC min and OC / EC 10 %

Table 3. Summary of numerical study results under different scenarios. a Results shown here are obtained assuming the following ambient conditions: RSD EC 50-100 %; f SOC 40-60 %; γ unc 20 %. b "+" represents SOC overestimation and "−" represents underestimation. c MRS: in S3, EC1 and EC2 are used for SOC calculation.

The top x axis represents the number of data points corresponding to the respective data averaging interval. Distributions of the OC / EC ratio at various averaging intervals are shown as box plots (empty circles: average; the line inside the box: median; the box boundaries: 75th and 25th percentiles; the whiskers: 95th and 5th percentiles). The red dots represent (OC / EC) pri calculated by MRS.

tributed to SOC if only EC is used as a tracer. If OC non-comb is small compared to SOC, such an approximation is acceptable. Otherwise, quantification of its contribution is needed. If a stable tracer for OC non-comb is available, determination of the OC non-comb contribution by MRS is possible, since this scenario is mathematically equivalent to S3 (e.g., relabel EC2 to tracer of OC non-comb and POC to OC non-comb ).