Potential sources and processes affecting speciated atmospheric mercury at Kejimkujik National Park, Canada: comparison of receptor models and data treatment methods

Source apportionment analysis was conducted with positive matrix factorization (PMF) and principal component analysis (PCA) methods using concentrations of speciated mercury (Hg), i.e., gaseous elemental mercury (GEM), gaseous oxidized mercury (GOM), and particulate-bound mercury (PBM), and other air pollutants collected at Kejimkujik National Park, Nova Scotia, Canada, in 2009 and 2010. The results were largely consistent between the 2 years for both methods. The same four source factors were identified in each year using PMF method. In both years, factor photochemistry and re-emission had the largest contributions to atmospheric Hg, while the contributions of combustion emission and industrial sulfur varied slightly between the 2 years. Four components were extracted with air pollutants only in each year using PCA method. Consistencies between the results of PMF and PCA include (1) most or all PMF factors overlapped with PCA components, (2) both methods suggest strong impact of photochemistry but little association between ambient Hg and sea salt, and (3) shifting of PMF source profiles and source contributions from one year to another was echoed in PCA. Inclusion of meteorological parameters led to identification of an additional component, Hg wet deposition in PCA, while it did not affect the identification of other components. The PMF model performance was comparable in 2009 and 2010. Among the three Hg forms, the agreements between model-reproduced and observed annual mean concentrations were excellent for GEM, very good for PBM, and acceptable for GOM. However, on a daily basis, the agreement was very good for GEM but poor for GOM and PBM. Sensitivity tests suggest that increasing sample size by imputation is not effective in improving model performance, while reducing the fraction of concentrations below method detection limit, by either scaling GOM and PBM to higher concentrations or combining them to reactive mercury, is effective. 
Most of the data treatment options considered had little impact on the source identification or contribution.


Introduction
Atmospheric mercury (Hg) exists as gaseous elemental Hg (GEM) and oxidized Hg; the latter can be in the gas phase (gaseous oxidized Hg, GOM) or associated with particulate matter (particulate-bound Hg, PBM). Identification of major sources and processes affecting ambient levels of different Hg forms will help mitigate the risks of Hg pollution. Atmospheric Hg can be produced from anthropogenic activities, natural events, and re-emission of previously deposited Hg; the latter two are sometimes grouped together as natural emission sources (Gustin et al., 2008; Pirrone et al., 2010; UNEP, 2013; Gaffney and Marley, 2014; Zhang et al., 2016). Natural events consist of volatilization from the ocean, volcanic eruption, geothermal activities, and weathering of Hg-containing minerals (Pirrone et al., 2010; Gaffney and Marley, 2014). Small-scale or artisanal gold mining, mining and smelting, and coal combustion are the three major anthropogenic sources (UNEP, 2013; Zhang et al., 2016). Some of the dry and wet deposited PBM and GOM will be reduced to GEM in soil, water, and vegetation surfaces, from which Hg will be re-emitted to the atmosphere in the form of GEM (Gaffney and Marley, 2014). However, the contributions of each source and process to a given receptor site are affected by many factors including proximity to sources and weather conditions.
X. Xu et al.: Sources and processes affecting atmospheric Hg at Kejimkujik National Park, Canada. Published by Copernicus Publications on behalf of the European Geosciences Union.
Various receptor-based models have been used to identify the sources and processes affecting air pollutant levels. Strengths and weaknesses of some receptor models have been reported previously (e.g., Viana et al., 2008; Watson et al., 2008; Belis et al., 2013). Among these, positive matrix factorization (PMF) and principal component analysis (PCA) are two commonly used methods. The PMF method provides quantitative source profiles and source contributions. The resultant source profiles could aid future studies in factor interpretation. Another strength of PMF is input variable screening and provision of model performance measures. Users can specify uncertainty values for each variable in each sample to reduce the impact of measurements with high uncertainties on the final results (US EPA, 2014a; Hopke, 2016). However, in order to derive profiles, PMF requires a large number of air pollutants, which are often unavailable. In contrast, PCA provides only a qualitative assessment of sources/processes and cannot determine the source contributions to pollutant concentrations (Hopke, 2015). One advantage of PCA over PMF is that meteorological parameters can be included as input, enabling the assessment of the effects of weather conditions on ambient concentrations of, e.g., Hg (Cheng et al., 2015). Therefore, it is beneficial to conduct source apportionment of atmospheric Hg using both PMF and PCA.
Comparisons of results of receptor models for PM source apportionment have been reported, e.g., by Paatero and Tapper (1994), Viana et al. (2008), Belis et al. (2013), and Gibson et al. (2015). To date, PCA and PMF have been applied to atmospheric Hg and other air pollutants in Toronto (Canada) (Cheng et al., 2009) and in Rochester, New York (USA) (Huang et al., 2010; Wang et al., 2013). However, both the Toronto and Rochester studies lacked a thorough comparison of the PMF and PCA results. Furthermore, the ability of receptor models to reproduce the observed concentrations should be assessed in order to gauge the model performance (Henry, 1991; Viana et al., 2008; Belis et al., 2015a), which has rarely been reported in the literature.
The overall objective of this study is to identify the factors affecting ambient Hg concentrations at a receptor site using PMF and PCA approaches. The specific objectives are to (1) identify the factors affecting ambient Hg concentrations using the PCA and PMF models, (2) summarize the similarities and differences in PMF factors and PCA components, (3) evaluate the PMF model performance by Hg form, (4) investigate the impact of including meteorological parameters on PCA results, and (5) assess the sensitivity of PMF results and performance to different treatments of missing data and low concentration values of speciated Hg.
The KEJ site is one of the first speciated Hg sites operated by Environment Canada outside the Arctic. This site was selected primarily because of the bioaccumulation issues in this area. Studies have found that common loons in Kejimkujik National Park had the highest mean blood Hg concentrations in the northeastern United States and southeastern Canada (Evers et al., 2007). Similarly, a 1996/97 survey found that yellow perch and common loons from Kejimkujik National Park and National Historic Site (Nova Scotia) had the highest blood Hg concentrations across North America. A 2006/07 follow-up study on yellow perch observed on average a 29 % increase in 10 out of 16 lakes, although anthropogenic emissions from North America decreased between the mid-1990s and the mid-2000s (Wyn et al., 2010).
The sampling site was surrounded by forests on a flat terrain. It was approximately 50 km away from the nearest coast, 120 km southwest of Halifax, and relatively remote from anthropogenic air emissions. A search of the National Pollutant Release Inventory (NPRI, Environment Canada, 2016) yielded seven Nova Scotia facilities reporting Hg air releases in both 2009 and 2010 (Fig. 1). Four of them were electric power generation stations, and the other three were a refinery, a cement plant, and a university. The provincial annual air emissions of Hg were 147.5 and 90.3 kg in 2009 and 2010, respectively (Table S1 in the Supplement). The two largest Hg emitters were Lingan Power Generating Station (2009-2010 average: 71 kg yr⁻¹) and Trenton Power Generating Station (26 kg yr⁻¹), located 450 and 250 km from the sampling site, respectively. The nearest anthropogenic Hg sources (Dalhousie University, Halifax: 0.17 kg yr⁻¹; Imperial Oil, Dartmouth Refinery: 2.8 kg yr⁻¹) were 140 km northeast of the sampling site. In addition to Hg sources, the nearby NPRI (Environment Canada, 2016) combustion and industrial sources were a biomass-fueled power station and a tire production factory located approximately 50 km east-southeast of the KEJ site (Table S1).

Data collection
GEM, GOM, and PBM concentrations were collected from 2009 to 2010 using Tekran® instruments (models 1130/1135/2537) at 3-hour intervals. Hourly concentrations of ground-level ozone (O₃) and meteorological parameters (temperature, relative humidity, wind speed, and precipitation amount) as well as daily concentrations of SO₂ and HNO₃, PM2.5 (2009 only), and particulate SO₄²⁻, NO₃⁻, Mg²⁺, Cl⁻, K⁺, Ca²⁺, NH₄⁺, and Na⁺ were also collected at the KEJ site. Detailed information on data collection can be found in Cheng et al. (2013) and the data access statement at the end of that paper.
Hourly or 3-hour concentrations of GEM, GOM, PBM, O₃, and meteorological data were averaged into daily values because PMF and PCA require the same interval for all input variables. All daily values were the same as those used in a PCA study by Cheng et al. (2013). The general statistics of the daily concentrations and meteorological parameters are listed in Tables 1 and 2 for year 2009 and year 2010, respectively. The total aerosol mass characterized in 2009 accounted for 80 % of the PM mass. The weather conditions were similar in the 2 years, with an annual mean relative humidity of 88 and 87 % in 2009 and 2010, respectively, and moderate wind speeds (4.7 and 4.4 km h⁻¹), but with a higher precipitation amount (1597 vs. 1480 mm yr⁻¹) and a lower temperature (6.6 vs. 8.1 °C) in 2009 than in 2010. The fraction of missing daily concentrations ranged from 0 % (ozone, 2010) to 41 % (PBM, 2009); these missing values are excluded from PMF or PCA. Among the three Hg forms, GEM had the fewest values below the method detection limit (MDL), while GOM had the largest percentages of concentrations below MDL, followed by PBM, in both years. The variability, as indicated by the coefficient of variation, was low for GEM but much higher for GOM and PBM.
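The daily averaging and the summary statistics described above can be sketched as follows. This is a minimal illustration, not the processing actually used for Tables 1-2: the function names, the timestamp format, the rule that any non-missing reading counts toward a daily mean, the use of the population standard deviation, and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

def daily_means(records):
    """Average ("YYYY-MM-DD HH:MM", value) records into daily values.

    A value of None marks a missing reading and is skipped (assumed rule).
    """
    by_day = defaultdict(list)
    for stamp, value in records:
        if value is not None:
            by_day[stamp[:10]].append(value)
    return {day: mean(vals) for day, vals in sorted(by_day.items())}

def summarize(daily, mdl):
    """Return (coefficient of variation, fraction of daily values below MDL)."""
    vals = list(daily.values())
    cv = pstdev(vals) / mean(vals)
    below = sum(v < mdl for v in vals) / len(vals)
    return cv, below

# Hypothetical 3-hour GOM readings in pg m^-3 (illustrative, not measured data).
gom = [
    ("2009-03-01 00:00", 1.0), ("2009-03-01 03:00", 3.0),
    ("2009-03-02 00:00", 0.5), ("2009-03-02 03:00", None),
    ("2009-03-02 06:00", 1.5),
]
daily = daily_means(gom)
cv, below = summarize(daily, mdl=2.0)
print(daily, round(cv, 3), below)
```

In practice a minimum number of valid readings per day would probably be required before a daily mean is accepted; that threshold is not stated here and is therefore left out.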

Model setup and case design
A detailed description of the theory of the PMF and PCA methods can be found in Cheng et al. (2015). Model setup and case design are described below.

PMF
EPA PMF5.0 (US EPA, 2014b) was used in this study. The 12 cases investigated are listed in Table 3. Two approaches were employed in PMF modeling to handle missing values. The first approach is listwise deletion. Listwise deletion excludes all the records having one or more missing values, resulting in a complete data matrix as required in PMF. However, it may cause a large reduction of the dataset when one of the pollutants has many missing values or several pollutants have missing values at different time periods. In environmental studies, this approach may lead to biased results because listwise deletion favors the records with high concentrations when below-MDL values are flagged as missing (Huang et al., 1999). The second approach is imputation, which increases the sample size in PMF. Hedberg et al. (2005) found that the relative error of factor profiles decreased as the sample size increased. In this study, geometric mean imputation and median imputation were used to minimize the undue influence of extreme values, as in Pekey et al. (2004). The effects of imputation were investigated in cases 09 + Mean, 10 + Mean, 09 + Median, and 10 + Median (Table 3).
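The two missing-value treatments can be illustrated with a short sketch. It assumes that each missing value is replaced by the geometric mean (or median) of all observed values of the same variable; the function names and the sample records are hypothetical and serve only to show how the two approaches differ in the resulting sample size.

```python
from statistics import geometric_mean, median

def listwise_delete(rows):
    """Keep only the records in which every variable has a value."""
    return [row for row in rows if all(v is not None for v in row.values())]

def impute(rows, fill):
    """Fill missing values of each variable with fill(observed values)."""
    variables = list(rows[0])
    fills = {
        var: fill([row[var] for row in rows if row[var] is not None])
        for var in variables
    }
    return [
        {var: fills[var] if row[var] is None else row[var] for var in variables}
        for row in rows
    ]

# Hypothetical daily records (None marks a missing value); not measured data.
rows = [
    {"GEM": 1.0, "GOM": 1.0},
    {"GEM": 2.0, "GOM": None},
    {"GEM": None, "GOM": 9.0},
    {"GEM": 32.0, "GOM": 3.0},
]

complete = listwise_delete(rows)            # only fully observed records remain
geo_filled = impute(rows, geometric_mean)   # all records kept
med_filled = impute(rows, median)

print(len(complete), len(geo_filled))
print(geo_filled[2]["GEM"], med_filled[2]["GEM"])
```

Listwise deletion halves this small dataset, while either imputation keeps every record; the extreme GEM value pulls the geometric mean above the median, which is why the two fills can differ.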
Cases 09 + RM, 10 + RM, 09 − RM, and 10 − RM (Table 3) were devised to investigate the effects of excluding or combining GOM and PBM into reactive mercury (RM) on the PMF results compared with the full dataset. Uncertainties of GOM and PBM measurements are considered high (Gustin et al., 2015). It has been reported that GOM may be collected on the PBM filter, and thus GOM concentrations could be biased low (Lynam and Keeler, 2005). Therefore, combining GOM and PBM into RM may reduce the uncertainties (Cheng et al., 2016). RM was calculated by summing GOM and PBM when both forms of Hg were detected.
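A minimal sketch of the RM calculation is given below. The text does not define "detected" precisely, so the function treats a value as detected when it is not missing and, optionally, when it is at or above a user-supplied detection limit; both the function name and this interpretation are assumptions.

```python
def reactive_mercury(gom, pbm, mdl=None):
    """Return RM = GOM + PBM (pg m^-3), or None if either form is not detected.

    A form counts as detected when it is not None and, if `mdl` is given,
    when it is at or above `mdl` (assumed interpretation).
    """
    if gom is None or pbm is None:
        return None
    if mdl is not None and (gom < mdl or pbm < mdl):
        return None
    return gom + pbm

print(reactive_mercury(1.5, 3.0))           # both reported
print(reactive_mercury(None, 3.0))          # GOM missing
print(reactive_mercury(1.5, 3.0, mdl=2.0))  # GOM below the assumed limit
```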
In Case 09ScaleRM and Case 10ScaleRM, a variable scaling factor was used to increase the GOM and PBM concentrations: [equation missing in the source text] where $x_i$ is the concentration of GOM or PBM in the $i$th sample. The scaling factor is large when the concentration is low, and vice versa, but the maximum concentration is unchanged.
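Equation-based uncertainties (US EPA, 2014a) were used in this study, expressed as

$$\mathrm{Unc} = \tfrac{5}{6} \times \mathrm{MDL}$$

when concentration ≤ MDL, and

$$\mathrm{Unc} = \sqrt{(\text{error fraction} \times \text{concentration})^{2} + (0.5 \times \mathrm{MDL})^{2}}$$

when concentration > MDL.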
For RM, the MDL was assumed to be 4 pg m⁻³, which is the sum of the MDLs of GOM and PBM (2 pg m⁻³ each).
The error fractions were assumed to be 15 % of concentrations for Hg forms and 10 % of concentrations for other compounds. The higher value for Hg reflects that most of the measured GOM and PBM concentrations are near or below the MDL, as seen in Tables 1-2; thus they have large uncertainties, as pointed out by Croghan and Egeghy (2003).
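The equation-based uncertainty can be sketched as below. The formula follows the form given in the EPA PMF 5.0 user guide (5/6 of the MDL at or below the MDL; otherwise the root sum of squares of the error-fraction term and half the MDL); that this exact form was applied here is an assumption, and the function name and example concentrations are hypothetical.

```python
import math

def equation_based_uncertainty(conc, mdl, error_fraction):
    """Equation-based uncertainty in the form of the EPA PMF 5.0 user guide."""
    if conc <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)

# Hypothetical GOM values (pg m^-3) with MDL = 2 pg m^-3 and a 15 % error fraction.
low = equation_based_uncertainty(1.0, mdl=2.0, error_fraction=0.15)
high = equation_based_uncertainty(10.0, mdl=2.0, error_fraction=0.15)
print(round(low, 3), round(high, 3))
```

For the below-MDL value the uncertainty (about 1.67 pg m⁻³) exceeds the concentration itself, which is how such readings are down-weighted in the fit.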
Following Polissar et al. (1998), constant uncertainties (100, 200, and 1000 % of the mean/median for GEM, PBM, and GOM, respectively) were used for imputed Hg concentrations, based on the uncertainty distributions of the below-MDL values in the two base cases. This is to down-weight the imputed values.
The so-called "total variable" (e.g., PM) was not used because this study focused on speciated Hg and the input variables include both PM ions and gaseous pollutants. No variables or samples were excluded after input data screening, so that all observations are reflected. No variables were down-weighted, with the exception of imputed values, because runs with and without GOM and PBM categorized as "weak" led to similar results. Other PMF input parameters include the number of runs, which was set to 20 to enable stability evaluation, with the best run being used, and the starting seed, which was set to 25.
PMF outputs used in this study include source profiles, model performances, and factor contributions. Different numbers of factors were also analyzed, and the four-factor results had the best interpretability (Liao, 2016). Therefore, four factors were retained in each case. A detailed analysis is presented in the Supplement, which supports the stability of PMF runs and justifies the final solution and the number of factors chosen. The factors were interpreted based on the comparison of the major variables (≥ 25 %) in each of the four factors to markers and source profiles in the literature, taking into consideration NPRI emission sources.
Various methods have been employed to evaluate receptor models' performance (e.g., Belis et al., 2015a, b; Cesari et al., 2016). In this study, stability indexes of model runs, scaled residual plots, observation-prediction scatter plots, and observation-prediction time series were used to evaluate the model performance for speciated Hg. The impact of each data treatment method on PMF results was assessed, taking into consideration interpretability of the factors and model performance for the three Hg forms.

PCA
The PCA source apportionment analysis using speciated Hg in 2009 and 2010 was already conducted in another study (Cheng et al., 2013). In this study, different cases were investigated, as listed in Table 4. Briefly, all compounds were included to enable comparison with PMF results (Case 2009 and Case 2010), instead of removing some air pollutants as in Cheng et al. (2013) due to a lack of correlation between those air pollutants and atmospheric Hg. Pairwise deletion of missing values in Cheng et al. (2013) was replaced with listwise deletion to be consistent with the PMF model input, which must be a complete data matrix. As shown in Table 4, there is a requirement of sample size in order to obtain statistically stable source apportionment results (Henry et al., 1984; Thurston and Spengler, 1985). Our datasets meet the more restrictive requirement by Thurston and Spengler (1985) in both years, by a margin of 90-300 in 2009 and 216-300 in 2010 (Tables 3-4).
The PCA runs were conducted using SPSS 22.0 (IBM Corp., USA). Cases 09-C&M and 10-C&M were included to evaluate the effects of weather conditions on factor identification. The dimensions of the reference cases in the PMF model and PCA are the same. After including the meteorological parameters in the PCA input, the dimensions of the input data are slightly smaller. The components with eigenvalues greater than 1 were retained for further analysis, following the Kaiser criterion (Kaiser, 1960). An oblique rotation method was used to verify the inter-correlations between the components. Principal components after varimax rotation were interpreted by comparing the major variables (loadings > 0.25) of the component with the outcomes of other studies and by checking NPRI sources in the region (Table S1).
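The two selection rules used above (eigenvalue greater than 1, loading greater than 0.25) can be sketched as follows. The analysis itself was run in SPSS; this is only an illustration of the rules. Applying the loading threshold to the absolute value is an assumption, and the function names, eigenvalues, and loadings are hypothetical.

```python
def kaiser_retain(eigenvalues):
    """Indices of components whose eigenvalue exceeds 1 (Kaiser criterion)."""
    return [i for i, ev in enumerate(eigenvalues) if ev > 1.0]

def major_variables(loadings, threshold=0.25):
    """Variables whose absolute loading exceeds the threshold, largest first."""
    picked = {var: val for var, val in loadings.items() if abs(val) > threshold}
    return sorted(picked, key=lambda var: abs(picked[var]), reverse=True)

# Hypothetical eigenvalues and loadings of one rotated component.
eigenvalues = [4.2, 2.9, 1.6, 1.1, 0.8, 0.4]
loadings = {"GOM": 0.61, "PBM": -0.48, "GEM": 0.12, "O3": 0.30}

retained = kaiser_retain(eigenvalues)
majors = major_variables(loadings)
print(retained, majors)
```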

PMF -base cases
In this section, only the two base cases, Case 2009 and Case 2010, are considered.

PMF sources
Table 5 and Figs. 2-3 present the percent concentration of each pollutant apportioned to each of the four factors. Factor 1 was named combustion emission due to large contributions of SO₄²⁻ (64 %) and HNO₃ (54 %) and a moderate contribution of GOM (31 %) (Table 5). Combustion emission includes fuel combustion and biomass burning. The small contribution of K⁺ (22 %) in this factor suggests a minor impact of biomass burning. SO₂ and NOₓ are precursors of SO₄²⁻ and HNO₃, respectively. These precursors are from combustion sources and were probably oxidized during transport from the sources to the receptor site (Liu et al., 2007). The presence of GOM is consistent with combustion emission, which is one of the GOM sources (Carpi, 1997). There were few NH₃ emissions from point sources near the study site (Table S1). Thus, the presence of NH₄⁺ (71 %) should be related to the transport and transformation of NH₃ from agricultural emissions as well as other physical and chemical processes (e.g., aqueous phase chemistry, condensational growth, droplet evaporation) producing NH₄⁺ (Zhang et al., 2008; Pitchford et al., 2009). In this factor, the molar ratio of NH₄⁺ to SO₄²⁻ is 1.7, although some observed profiles have ratios greater than 2 (Lee et al., 1999). Ratios less than 2 suggest insufficient amounts of NH₃ to neutralize H₂SO₄; thus H₂SO₄ will react with other compounds to form sulfate (Pavlovic et al., 2006; Zhang et al., 2008). The moderate contribution of PM (42 %) is consistent with the presence of particulate SO₄²⁻ and NH₄⁺. Also, SO₄²⁻ accounted for over 50 % of PM mass (Table 1). In addition to a lack of major combustion facilities nearby (Table S1), a strong correlation between SO₄²⁻ and NH₄⁺ (Tables S2-S3) also suggests formation of secondary aerosols. Therefore, this factor suggests transported plumes instead of fresh emissions.
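The ammonium-to-sulfate molar ratio discussed above is obtained by converting mass concentrations to moles before dividing. The sketch below assumes concentrations in µg m⁻³ and rounded molar masses of 18.04 and 96.06 g mol⁻¹; the function name and the input values are hypothetical and were chosen only to give a ratio near 1.7.

```python
M_NH4 = 18.04   # g mol^-1, ammonium (rounded)
M_SO4 = 96.06   # g mol^-1, sulfate (rounded)

def ammonium_to_sulfate_molar_ratio(nh4_ug_m3, so4_ug_m3):
    """Molar ratio NH4+/SO4(2-) from mass concentrations in the same units."""
    return (nh4_ug_m3 / M_NH4) / (so4_ug_m3 / M_SO4)

# Hypothetical factor-profile concentrations in ug m^-3.
ratio = ammonium_to_sulfate_molar_ratio(0.32, 1.0)
print(round(ratio, 2))
```

A ratio of 2 corresponds to full neutralization as (NH₄)₂SO₄, so values below 2 indicate that part of the sulfate is not balanced by ammonium.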
Factor 2 was assigned to industrial sulfur. The major variables PBM and SO₂ are indicators of coal combustion (Huang et al., 2010). The minor contributions of HNO₃ and NO₃⁻ also suggest combustion sources because their precursor, NOₓ, is mainly released by combustion sources (Liu et al., 2007). However, there were no combustion sources emitting Hg compounds near the KEJ site in 2009 (Table S1). Therefore, this factor is more likely related to industrial sources in the region. As shown in Table S1, point sources of industrial sulfur in the province of Nova Scotia include tire production, engineered wood production, food industry, and universities. Coal-fired power plants and metal production are major sources of sulfur; however, there are no such sources close to the sampling site. These sources are located in the eastern US, and their emissions could be transported to the site. Mobile sources of sulfur are ships and vessels from nearby ports (Cheng et al., 2013).
Factor 3 was named photochemical process and re-emission of Hg due to the high contributions of ozone (72 %), GEM (76 %), GOM (69 %), and PBM (63 %) and moderate contributions of Ca²⁺ (45 %) and K⁺ (37 %). The high contribution of ozone indicates an ozone-rich environment, resulting in oxidation of GEM to GOM and the subsequent formation of PBM (Pal and Ariya, 2004; Liu et al., 2007). Although results of recent studies show that the reaction rate of Hg and ozone has large uncertainties, the oxidation of Hg by bromine is very fast (Goodsite et al., 2004). The KEJ site is near the Atlantic, making the oxidation of Hg by bromine applicable. The presence of K⁺ is related to soil emission or biomass burning (Andersen et al., 2007), while Ca²⁺ is related to soil/crust. The site is located in Kejimkujik National Park. Therefore, it is under the impact of soil emission, emission from the nearby biomass-fired power station (Table S1), and transported biomass combustion. It was estimated that re-emission of Hg from biomass burning and land surfaces contributed 13 and 34 % of the global re-emission budget, respectively (Pirrone et al., 2010). Thus, the high contribution of GEM may be attributable to the re-emission of GEM. The emission from soil and biomass combustion was also identified in the PCA study at this site (Cheng et al., 2013). An examination of the time series factor profiles revealed that model-reproduced K⁺, O₃, GEM, GOM, and PBM concentrations (in this factor) were rather smooth. The impact of biomass burning seems to be small in this factor because no periods or episodes of high K⁺, O₃, and Hg concentrations were identified. The relatively stable patterns of K⁺ and GEM suggest re-emission of GEM, while GOM was high in [text missing in the source]
Factor 4 has high contributions of Cl⁻ (100 %), Mg²⁺ (83 %), and Na⁺ (86 %) and moderate contributions of Ca²⁺ (31 %), K⁺ (39 %), and NO₃⁻ (40 %). The presence of Na⁺, Mg²⁺, and Cl⁻ indicates marine aerosols because these elements are rich in seawater (Huang et al., 1999). The strong correlations among these three compounds (≥ 0.89, Tables S2-S3) also suggest a common source. As the sampling site is located near the Atlantic, the presence of marine aerosols is reasonable. Major production pathways of NO₃⁻ include reaction of HNO₃ with NH₃, sea salt, and soil dust (Pakkanen, 1996). In this factor, the NO₃⁻ is probably related to the reaction of HNO₃ and sea salt. Thus, this factor was named sea salt.
As seen in Table 5 and Figs. 2-3, the same four factors were identified in years 2009 and 2010. The profiles of each factor were also largely consistent between the 2 years. Factor 1 in 2010 is similar to the factor named combustion emission in Case 2009. However, this factor lacks PM (not available in 2010) and has a higher contribution from K⁺, which may relate to biomass burning. This factor is assigned the same name as in 2009 because the presence of SO₄²⁻ and HNO₃ is enough to identify a combustion process (Liu et al., 2007). It should be noted that this factor has a much smaller contribution of GOM than in 2009. This may be due to a large reduction in SO₂ emissions (2.42 million tons or 32 % reduction) from coal-fired power plants across the United States between 2008 and 2010 (US EPA, 2011). Large reductions in Hg (−39 %) and SO₂ (−35 %) emissions also occurred in Nova Scotia between 2009 and 2010, as seen in Table S1. However, the reduction in Hg emissions is only reflected in GOM (−75 %), while GEM decreased a little and PBM increased slightly. Moreover, the long-term effects of emission reductions on Hg concentrations and source contributions should be investigated.
The major variables of factor 2 are also similar to those of the factor industrial sulfur in Case 2009. However, this factor has a moderate contribution of GOM instead of PBM as in 2009. Factor 3 has similar major variables to the factor named photochemistry and re-emission in Case 2009. Factor 4 is dominated by Cl⁻ (100 %), Na⁺ (83 %), and Mg²⁺ (75 %). This factor was named sea salt as in Case 2009.

PMF source contributions
The PMF factor contributions of the two base cases are presented in Table S4 (Case 2009) and Table S5 (Case 2010). In both years, the factor photochemistry and re-emission had the largest contributions to GEM (averaging 77 and 79 % in 2009 and 2010, respectively), GOM (73 and 67 %), and PBM (69 and 80 %) among all four factors. In other words, ambient Hg concentrations at the KEJ site were dominated by photochemistry and re-emission of Hg. Industrial sulfur had moderate contributions to GOM (average, 29 %) in 2010 instead of PBM as in 2009 (22 %). Combustion emission contributed 26 % of GOM in 2009 but 11 % each of GEM and PBM in 2010. The factor sea salt only had minor contributions to GEM (14 % in 2009 and 9 % in 2010) and PBM (< 10 % in both years). This is not unexpected because GEM is likely to be oxidized to GOM by the in situ photochemical process under the bromine-rich environment (Obrist et al., 2011). However, this factor has no contribution to GOM because it was estimated that > 80 % of GOM in the marine boundary layer is absorbed by sea salt aerosols and is subsequently deposited onto the Earth's surface, where evasion occurs (Holmes et al., 2009).

PMF model performance
Among the three Hg forms, GEM had the best performance in terms of scaled (i.e., standardized) residuals because it had a normal distribution and fewer absolute values of scaled residuals greater than 3 in both years (Case 2009 and Case 2010, Table 6). Table 6 also lists the coefficient of determination (R²) and the slope of the regression line for speciated Hg in the observation-prediction scatter plots (Figs. S5-S6 in the Supplement) to evaluate the overall model-measurement agreement. Between the 2 years, the agreement was better for GEM in 2010 and for PBM in 2009 because of higher R² values and slopes closer to 1. The low values of R² and slope in both years indicate the agreement was poor for GOM.
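The performance measures in Table 6 can be illustrated with a short sketch. It assumes that the scaled residual is the observed minus the model-reproduced concentration divided by the input uncertainty, and that the regression line is fitted with the reproduced values on the y axis; the function names and the data are hypothetical.

```python
def scaled_residuals(observed, predicted, uncertainty):
    """(observed - predicted) / uncertainty for each sample."""
    return [(o - p) / u for o, p, u in zip(observed, predicted, uncertainty)]

def slope_and_r2(observed, predicted):
    """Least-squares slope of predicted on observed, and R^2."""
    n = len(observed)
    mx = sum(observed) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, predicted))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# Hypothetical daily concentrations and uncertainties (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 3.0]
uncertainty = [0.25, 0.25, 0.25, 0.25]

residuals = scaled_residuals(observed, predicted, uncertainty)
n_large = sum(abs(r) > 3 for r in residuals)
slope, r2 = slope_and_r2(observed, predicted)
print(residuals, n_large, slope, r2)
```

The example shows that a high R² alone does not imply good agreement: here the points fall on a straight line, yet the slope of 0.5 means high concentrations are underpredicted.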
The observation-prediction time series of the three Hg forms reveal the model's ability to reproduce the observed concentrations on a day-to-day basis. In Case 2009, the observation-prediction time series (Fig. S7) were split into three time periods by the data gaps: January to February (period 1), March to July (period 2), and October to December (period 3). GEM had better performance than the other two forms because the peak values were reproduced by the model in all three periods. However, the modeled values in period 3 are too low compared to observed concentrations, leading to a lower R² (Table 6). The performance for PBM is better than for GOM because the model-reproduced concentrations tracked the observed concentrations well in period 2. However, PBM concentrations were underestimated and overestimated by the model in period 1 and period 3, respectively. The GOM concentrations were not reproduced well, with unmatched peak values in period 2, and there was a clear separation of observed and model-reproduced trend lines in periods 1 and 3, leading to overprediction.
In Case 2010, the time series (Fig. S8) were split into two periods, January-June (period 1) and July-December (period 2), based on a clearly visible overestimation of GOM concentrations in the second period. The reproduced GEM concentrations tracked the trend of observations well in both periods but with more fluctuations. The model was unable to reproduce high GOM concentrations in period 1. For PBM, the reproduced concentration was rather flat, completely missing the high concentration episode in spring 2010.
The model-measurement agreement was further quantified with the ratios of reproduced to observed concentrations (Fig. 4). In both years, the reproduced GEM agreed well with the observed concentrations, as supported by the small range of ratios (0.56-1.32 in 2009, 0.42-1.43 in 2010) and mean ratios approaching 1 (0.97 and 0.98). On an annual basis, the observed GEM concentrations were also well reproduced because the ratios of reproduced to observed annual means were almost 1 (0.97 and 0.98) (Tables S4-S5). Compared with GOM, PBM had better agreement between the reproduced and observed concentrations, with a smaller range of ratios (0.40-13.4 and 0.14-18.3 vs. 0.13-53 and 0-193) and mean ratios closer to 1 (2.09 and 1.98 vs. 5.89 and 4.44). In spite of large sample-to-sample variability in the reproduced / observed ratios, the model performance was very good for PBM (the ratios of reproduced to observed annual means being 1.03 and 1; Tables S4-S5) and reasonable for GOM (0.86 and 1.34) in reproducing annual means.
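The difference between the mean of the daily ratios and the ratio of the annual means can be made concrete with a small sketch. The function name and the data are hypothetical; skipping samples with an observed concentration of zero when forming daily ratios is an assumption made to avoid division by zero.

```python
def ratio_summary(observed, reproduced):
    """Range and mean of daily reproduced/observed ratios, and ratio of means."""
    ratios = [r / o for o, r in zip(observed, reproduced) if o > 0]
    return {
        "min": min(ratios),
        "max": max(ratios),
        "mean_ratio": sum(ratios) / len(ratios),
        "ratio_of_means": (sum(reproduced) / len(reproduced))
        / (sum(observed) / len(observed)),
    }

# Hypothetical daily concentrations (illustrative only).
observed = [1.0, 2.0, 4.0]
reproduced = [2.0, 2.0, 3.0]
summary = ratio_summary(observed, reproduced)
print(summary)
```

Overprediction of a low concentration inflates the mean of the daily ratios (1.25 here) even though the annual means agree exactly, which is consistent with mean ratios near 2 for PBM alongside annual-mean ratios close to 1.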

Comparison between PMF in year 2009 and 2010
Overall, the interpretability of the factors was similar in the 2 years. The same factors were observed in 2009 and 2010, and most factor contributions were highly consistent between the 2 years. Among the three Hg forms, PMF reproduced GEM concentrations well in both years. Possible reasons for the poor performance on PBM and GOM include PMF uncertainties for modeling pollutants that undergo various transformation processes, unlike the modeling of only aerosols. PMF does not account for chemical reactions that may occur as the air mass travels from source to receptor. Another likely reason is lower concentration levels and much higher percentages of readings below MDL (Tables 1-2), leading to large uncertainties. However, the differences in sample size (161 in 2009 vs. 290 in 2010) and fractions of below-MDL values (Tables 1-2) alone could not explain the mixed results of poor performance on GOM in 2009 and PBM in 2010. Further examination of the time series (Figs. S7 and S8) suggests that the reduced performance could also be attributable to high concentration episodes of GOM in 2009 and PBM in 2010. The impact of Hg data treatment on PMF results was investigated and the results are presented in Sect. 3.4.

Case 09-C
The component loadings of Case 09-C are presented in Table 7. PC1 was named combustion/industrial emission due to positive loadings of PBM, PM, O₃, SO₂, HNO₃, Ca²⁺, K⁺, NO₃⁻, NH₄⁺, and SO₄²⁻. Most major compounds except O₃ were also found in a component named "transport of combustion and industrial emissions" in another PCA study using the same dataset (Cheng et al., 2013). The high loadings of secondary pollutants HNO₃, NO₃⁻, and SO₄²⁻ indicate that the factor represents transport of combustion/industrial emission because their precursors (NOₓ and SO₂) are mainly emitted by combustion/industrial sources (Liu et al., 2007). The precursors may be oxidized during the transport process. The moderate loading of O₃ is also related to the transport of combustion emission because the precursors of O₃ (NOₓ and VOC) are emitted from mobile and stationary combustion sources. Ammonia is likely related to the transport of agricultural emissions and the reaction of NH₃ with H₂SO₄ or HNO₃ (Pitchford et al., 2009).
PC2 has high loadings of Na⁺, Mg²⁺, Cl⁻, and K⁺ and moderate loadings of Ca²⁺. Those compounds indicate marine aerosols (Huang et al., 1999). The moderate loading of NO₃⁻ is likely due to the reaction of HNO₃ and sea salt (Pakkanen, 1996). As in the PMF factor interpretation, the identification of the component sea salt is relevant because the monitoring site is near the Atlantic.
PC3 has positive loadings of GEM, GOM, PBM, and O₃. The positive loadings on O₃ and GOM indicate the photochemical production of GOM (Huang et al., 2010). The positive loading of GEM is somewhat unexpected because the photochemical production of GOM consumes GEM, thus leading to opposite signs of GEM and GOM (e.g., Huang et al., 2010). However, daily average concentrations were used in this study instead of the 2 h means used in Huang et al. (2010). The daily GEM and GOM were indeed positively correlated (r = 0.37 in 2009, Table S2; 0.31 in 2010, Table S3). Using the same dataset, Cheng et al. (2013) conducted further analysis on O₃ concentrations and %GOM / TGM (TGM = GEM + GOM) ratios. The ratio is indicative of the degree of oxidation. The results showed that the %GOM / TGM ratio increased with O₃ when O₃ concentrations were greater than 40 ppb, suggesting gas phase oxidation of Hg at this coastal site. Therefore, this factor was named photochemical production of GOM.
PC4 represents gas-particle partitioning of Hg. The negative loading of PBM and the positive loading of GOM indicate the partitioning process. The positive loadings of Ca²⁺ and K⁺ suggest soil aerosols (Cheng et al., 2012), which could be abundant at Kejimkujik National Park.
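The %GOM / TGM ratio mentioned above can be computed as in the sketch below. It assumes GEM is reported in ng m⁻³ and GOM in pg m⁻³, so GOM is converted before the two are summed; the function name, the units of GEM, and the example values are assumptions.

```python
def percent_gom_in_tgm(gem_ng_m3, gom_pg_m3):
    """%GOM / TGM with TGM = GEM + GOM, after converting GOM to ng m^-3."""
    gom_ng_m3 = gom_pg_m3 / 1000.0
    tgm_ng_m3 = gem_ng_m3 + gom_ng_m3
    return 100.0 * gom_ng_m3 / tgm_ng_m3

# Hypothetical daily means: GEM = 1.4 ng m^-3, GOM = 2.8 pg m^-3.
share = percent_gom_in_tgm(1.4, 2.8)
print(round(share, 3))
```

Because GOM is roughly three orders of magnitude lower than GEM, the ratio is a fraction of a percent, and changes in it mainly reflect changes in GOM.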
Three out of four components (combustion/industrial emission, photochemical production of GOM, and gasparticle partitioning of Hg) have significant association with ambient Hg concentrations at the site, while sea salt has little.

Case 09-C&M
Five principal components were extracted when meteorological data were included in PCA (Case 09-C&M, Table 7). The loadings in PC1-PC4 are similar to the loadings of PC1, PC2, PC4, and PC3 in Case 09-C, respectively. Thus the names of those four components were retained. The inclusion of meteorological parameters resulted in small loadings of relative humidity (−0.26) in PC1 and wind speed (0.32) in PC2, as well as a moderate loading of wind speed (0.52) in PC4. A large loading of temperature (0.94) was observed in PC3. The opposite signs of temperature and PBM are consistent with the gas-particle partitioning process because low temperatures favor the formation of PBM (Rutter and Schauer, 2007). The lack of GEM in PC3 (Case 09-C&M) did not affect the identification of this factor, because the partitioning of GEM onto particles is much weaker than that of GOM (Liu et al., 2007).
PC5 was derived mostly from meteorological variables. The negative loading of GOM and positive loadings of relative humidity and precipitation suggest removal of GOM by dew, cloud, and precipitation (Cheng et al., 2013). The loading of GOM is small but nonetheless consistent with the wet deposition process because GOM is more easily removed than GEM due to its higher water solubility (Gaffney and Marley, 2014). Therefore, this component was named Hg wet deposition.
Similar to Case 09-C, all components except sea salt are associated with ambient Hg concentrations. After the inclusion of meteorological data, each factor contains at least one meteorological parameter. The presence of meteorological variables did not affect the determination of the components, except that a new component, Hg wet deposition, was identified.

Case 10-C
The component loadings of Case 10-C are listed in Table 8. PC1 was named combustion emission. The positive loadings of HNO₃, NO₃⁻, and SO₄²⁻ are indicative of transport of combustion emission because their precursors (NO₂ and SO₂) are mainly released by combustion emissions (Liu et al., 2007). The high positive loading of NH₄⁺ represents transport of agricultural emissions of ammonia, which may react with H₂SO₄ and HNO₃ during the transport process (Pitchford et al., 2009). The positive loadings of Ca²⁺ and K⁺ indicate biomass burning from wildfires or biomass-fueled power stations (Andersen et al., 2007).
PC2 was named sea salt due to high loadings of Na⁺, Mg²⁺, and Cl⁻, because these three compounds are rich in seawater (Huang et al., 1999). PC3 has the same major variables as the component photochemical production of GOM in 2009. Therefore, PC3 was also named as such.
PC4 was assigned to industrial source. The positive loadings of GOM and SO₂ indicate coal combustion (Lynam and Keeler, 2006), although no combustion facilities were reported near the KEJ site in 2010 (Table S1). The positive loadings of SO₄²⁻ and HNO₃ are consistent with the transport of industrial emissions which release their precursors, SO₂ and NOₓ (Liu et al., 2007). Therefore, this factor was named industrial source. Two out of four factors (i.e., photochemical production of GOM and industrial source) have significant association with Hg compounds.

Case 10-C&M
As shown in Table 8, five principal components were extracted in Case 10-C&M. The loadings in PC1-PC3 and PC5 are similar to the loadings of PC1-PC4 in Case 10-C, respectively. Thus the names of those four components were retained. The additional negative loading of temperature (−0.52, Table 8) and positive loading of wind speed (0.52, Table 8) in PC3 may indicate colder air flows from the north containing more O₃ and GOM (Cheng et al., 2013). This is reasonable because Hg sources in Nova Scotia were mainly located north of the sampling site (Fig. 1). PC4 in Case 10-C&M was named Hg wet deposition due to negative loadings of GOM and PBM and positive loadings of relative humidity, wind speed, and precipitation, similar to PC5 in Case 09-C&M (Table 7). Three out of five components (i.e., photochemical production of GOM, industrial source, and Hg wet deposition) were associated with Hg concentrations. The influence of meteorological data on the identification of components was also similar to that in 2009. For Cases 09-C&M and 10-C&M, a detailed comparison of PCA results in this study and in Cheng et al. (2013) can be found in Liao (2016).

Comparison between PCA in year 2009 and 2010
In each year, four components were extracted in PCA with air pollutants only. The two common factors between the 2 years are photochemical production of GOM and sea salt. The former has a strong association with Hg compounds, while the latter has little. The component gas-particle partitioning of Hg was only identified in 2009, likely due to a lower percentage of PBM readings < MDL than in 2010 (Table 9, Case 2009 and 2010). It is also consistent with the stronger correlations of temperature with GOM and PBM (r = 0.46 and −0.43, respectively, Table S2) in 2009, compared with nonsignificant or weak correlations (r = −0.04 and −0.16, Table S3) in 2010.
The component combustion/industrial emission in 2009 affected PBM and SO₂ levels. It was split into two components in 2010, combustion emission and industrial source. The former was no longer strongly associated with any of the three Hg forms, while the latter was associated with GOM and SO₂. This is probably due to the reduction of coal combustion in Canada and the USA, evident from lower provincial Hg (reduction of 39 %) and SO₂ emissions (−35 %) in 2010 (Table S1). The reductions in GEM, GOM, and SO₂ concentrations at the KEJ site were 3, 75, and 43 %, respectively, in 2010 (Tables 1-2). The shift from a PBM and SO₂ relationship in 2009 to a GOM and SO₂ relationship in 2010 is supported by a strong correlation between PBM and SO₂ (r = 0.63, Table S2) in 2009, but little correlation (r = 0.06) accompanied by a moderate correlation between GOM and SO₂ (r = 0.30) (Table S3) in 2010. The shift is also consistent with the PMF results where industrial sulfur accounted for 21 % of PBM in 2009 (Table S4) but 29 % of GOM in 2010 (Table S5).
In both years, inclusion of meteorological parameters did not affect the identification of the four factors from air concentrations. However, relative humidity and precipitation yielded an additional component named Hg wet deposition.
Overall, the PCA results were largely consistent between the 2 years, in terms of the number of components, impact of meteorological parameters, and major processes associated with ambient Hg. The changing emissions/concentrations and the resultant correlations among monitored air pollutants from 1 year to another are reflected in the limited shifting of variable loadings.
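The pairwise correlations cited from Tables S2 and S3 are Pearson coefficients between daily series. As a hedged illustration only, the sketch below computes such a coefficient in plain Python; the function name and the six data points are made up for this example and are not taken from the supplementary tables.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily values (illustrative only): PBM decreasing as temperature rises.
temperature = [-5.0, 0.0, 4.0, 10.0, 15.0, 21.0]
pbm = [6.0, 5.5, 4.0, 3.5, 2.0, 1.5]
r_temp_pbm = pearson_r(temperature, pbm)
```

A negative coefficient between temperature and PBM, as in this toy series, is the pattern that the gas-particle partitioning interpretation relies on.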

Comparison of PMF and PCA results
The PCA loadings and the factor profiles as well as factor contributions in the PMF model have very different meanings. In PCA, variables with large loadings indicate their correlation or association with that component derived from all samples. In PMF, presence of variables in profiles indicates their contribution to that source/process derived from all samples, while the contribution values are further quantified in source contribution tables of each sample. Therefore, a direct comparison between the PMF and PCA results is not feasible. However, the similarities and differences in the major sources/processes identified by each approach, chemical markers in each factor profile or component, and the impact or association of factors and components on Hg could reveal strengths and weaknesses of each method.
A comparison of Tables 5 and 7-8 (cases with air concentrations only) shows that Na⁺, Cl⁻, and Mg²⁺ are markers of sea salt in both PMF and PCA. Similarly, GEM, GOM, PBM, and O₃ indicate photochemistry. Both methods suggest strong contribution to or association between Hg compounds and photochemistry, but weak with sea salt. Both methods identified combustion and industrial sources, while the variables in factors and components differed to some extent. Furthermore, combustion and industrial were separate sources in PMF in both years and in PCA in 2010, but combined as one component in PCA in 2009. Overall, PMF profiles are more consistent between the 2 years, while the PCA loadings are more sensitive to correlation among variables. However, the shift of PBM and SO₂ to GOM and SO₂ loadings in PCA between the 2 years is consistent with the shift of those two pairs in combustion and industrial sulfur profiles/contributions in PMF. In contrast, gas-particle partitioning of Hg was only recognized in PCA. This is because the identification of this factor relies on negative association between PBM and GOM (Table 7), but such association is not reflected in PMF due to its nonnegative nature. This is one of the limitations of PMF. Furthermore, the inclusion of meteorological conditions in PCA enables identification of a new component related to weather conditions. The good agreement between PMF and PCA outputs is consistent with comparisons of receptor models in PM source apportionment (Viana et al., 2007; Cesari et al., 2016). A common weakness of PCA and PMF is the suggestive nature of factors/components. Other techniques, such as back trajectories, have been used in previous studies to verify some factors (Cheng et al., 2015). Overall, when accompanied by model performance evaluation, PMF results carry more confidence.
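The sign argument can be made concrete with a two-variable case. The block below is a sketch under stated assumptions, not a derivation from this study: it considers unrotated PCA on two standardized variables (for example GOM and PBM) with correlation r, and it uses the conventional PMF notation (x_ij for the concentration of compound j in sample i, g_ik and f_kj for factor contributions and profiles, e_ij for residuals, u_ij for uncertainties, p for the number of factors), which is assumed here rather than taken from the text. Any rotation applied in the actual analysis is ignored.

```latex
\begin{aligned}
% PCA: correlation matrix of two standardized variables and its eigen-decomposition
R &= \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad
\lambda_{\pm} = 1 \pm r, \qquad
v_{\pm} = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}, \\
% For r < 0 the leading eigenvalue is \lambda_- = 1 + |r|, and its loadings are
\ell_{-} &= \sqrt{\lambda_{-}}\; v_{-}
          = \sqrt{\tfrac{1 + |r|}{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
% PMF: nonnegative factorization with uncertainty-weighted residuals
x_{ij} &= \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}, \qquad g_{ik} \ge 0, \quad f_{kj} \ge 0, \\
Q &= \sum_{i} \sum_{j} \left( \frac{e_{ij}}{u_{ij}} \right)^{2}.
\end{aligned}
```

When r is negative, the leading PCA component carries loadings of opposite sign on the two variables, which is how a partitioning process can surface in PCA. In PMF every profile entry f_kj is constrained to be nonnegative, so a single factor cannot encode such an anticorrelation.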

Case 10ScaleRM
The factor profiles and contributions of Case 10ScaleRM are similar to those in Case 2010 (Fig. 3, Table S5). A noticeable deviation is the much smaller contribution by GOM in factor 2 compared to Case 2010. However, factor 2 was still assigned to industrial sulfur because of the presence of SO₂ and NO₃⁻.

Performance
Firstly, the distribution of scaled residuals as well as the R² value and the slope of the regression line for speciated Hg in the observation-prediction scatter plot were evaluated for the six cases (Table 6, Fig. S6). Similar to 2009, the comparable performances observed in Case 2010, Case 10−RM, Case 10+RM, and Case 10ScaleRM indicate that the model performance on GEM is insensitive to excluding, scaling, or combining GOM and PBM to RM. Case 10ScaleRM also has the best performances on GOM and PBM among all the cases in 2010. Unlike in 2009, the negative impact of imputation was smaller when the median value was used than with mean imputation. Secondly, in the observed and reproduced time series (Fig. S8), imputation resulted in more severe fluctuation in reproduced GEM concentration, as in 2009, but less so when median values were used. Scaling of GOM or PBM also improved the reproducibility of day-to-day variability in the observed values, owing to a large reduction in concentrations below MDL (Table 9). Among the six cases, the most significant change is in PBM with imputation. There were additional high concentration episodes in early 2010 when imputation of non-Hg compounds brought back Hg concentrations otherwise removed by listwise deletion in the base case, leading to increased standard deviation (Table 9). Those peaks were completely missed by the model, leading to deteriorated agreement.
Finally, the distributions of the ratios of reproduced to observed Hg concentrations and the ratio of reproduced to observed annual means changed little among the first five cases in 2010 (Fig. 4 and Table S5). The exceptions are underprediction of the annual mean of PBM in the two imputation cases and overprediction for RM. Compared with the base case, the distribution of ratios for GOM and PBM became narrower and shifted toward smaller values, leading to underprediction of PBM.
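The performance measures named above (regression slope and R² from the observation-prediction scatter plot, daily reproduced-to-observed ratios, and the ratio of annual means) can be sketched in a few lines of Python. This is an illustrative sketch rather than the evaluation code used in the study; all function names and the four data points are hypothetical.

```python
def regression_slope_r2(observed, predicted):
    """Least-squares slope, intercept and R^2 of predicted regressed on observed."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("regression is undefined for a constant series")
    slope = sxy / sxx
    intercept = mean_p - slope * mean_o
    r2 = sxy * sxy / (sxx * syy)
    return slope, intercept, r2


def daily_ratios(observed, predicted):
    """Reproduced-to-observed ratio for each day with a positive observation."""
    return [p / o for o, p in zip(observed, predicted) if o > 0]


def annual_mean_ratio(observed, predicted):
    """Ratio of the reproduced mean to the observed mean over all days."""
    return (sum(predicted) / len(predicted)) / (sum(observed) / len(observed))


# Made-up values (illustrative only).
obs = [1.0, 2.0, 4.0, 1.0]
pred = [1.5, 2.0, 2.5, 2.0]
slope, intercept, r2 = regression_slope_r2(obs, pred)
ratios = daily_ratios(obs, pred)
mean_ratio = annual_mean_ratio(obs, pred)
```

In this toy series the ratio of means equals 1 while the daily ratios range from 0.625 to 2 and the slope is well below 1, which mirrors how annual agreement can look good even when day-to-day agreement is poor.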

Comparison of 2009 and 2010 among different data treatments
The different characteristics of Hg forms led to different impacts of data treatment on model results and performances in the 2 years. Imputation using geometric mean and median values led to minor changes in factor profiles in both years, with more variations in contributions of Hg forms in 2009 but of non-mercury compounds in 2010. This is likely because the Hg and non-Hg compounds were missing at a larger percentage in 2009 and 2010, respectively. The lack of significant impact is likely due to already high sample-to-compound ratios (161 samples/15 compounds in 2009, 290 samples/14 compounds in 2010, Tables 1-3). Huang et al. (1999) have reported that mean imputation generally yielded better PMF results than listwise deletion, especially with a higher percentage of missing values. In particular, the compositions of crustal and marine factors were closer to those of crust and seawater. Imputation resulted in degraded performance on all three Hg forms, but for different reasons. For GEM, it is largely due to more fluctuation than the already overpredicted one in the base case in both years. For PBM in 2010, the peak values otherwise removed in listwise deletion (base case) are beyond the model's ability to match. This seems to be a random occurrence and is an uncertainty of imputation. Between geometric mean and median imputations, the impact was similar in both years for each of the three Hg forms. The exception is with median imputation in 2010: there was less deviation in factor profile/contribution from the base case. The reason is unclear because the difference between geometric mean and median was very small for GEM in both years and slightly greater in 2009 for GOM and PBM (Tables 1-2).
In both years, some changes in the factor profiles and factor contributions but little change in model performance were observed in the cases excluding GOM and PBM. Scaling GOM and PBM or combining them into RM improved model-measurement agreement, suggesting the approach is effective in both years in spite of large percentages of below-MDL values (GOM, 78 % in 2009 vs. 96 % in 2010; PBM, 48 % in 2009 vs. 46 % in 2010; Tables 1-2). The improvement is largely attributable to the reduction in concentrations below MDL (Table 9), which in turn reduced PMF uncertainty expressed in Eq. (2). Another benefit of using a variable scaling factor is reduced data variability, as indicated by smaller coefficients of variation in Table 9. PMF is better at reproducing compounds with less variability. However, there is little evidence that the scientific uncertainties of scaled GOM and PBM concentrations are indeed reduced from those of the original dataset.
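The difference between listwise deletion (the base case) and imputation of missing values can be illustrated with a small Python sketch. It is not the preprocessing code of the study: the row layout (one dictionary per daily sample, with `None` marking a missing value), the function names, and the numbers are all assumptions made for this example.

```python
from statistics import geometric_mean, median


def listwise_delete(rows):
    """Keep only samples (rows) in which every compound was measured."""
    return [row for row in rows if all(v is not None for v in row.values())]


def impute(rows, method="geometric_mean"):
    """Fill each compound's missing values with its geometric mean or median.

    Values must be positive for the geometric mean to be defined.
    """
    if method not in ("geometric_mean", "median"):
        raise ValueError("method must be 'geometric_mean' or 'median'")
    stat = geometric_mean if method == "geometric_mean" else median
    compounds = list(rows[0].keys())
    fill = {}
    for c in compounds:
        measured = [row[c] for row in rows if row[c] is not None]
        fill[c] = stat(measured)
    return [
        {c: (row[c] if row[c] is not None else fill[c]) for c in compounds}
        for row in rows
    ]


# Made-up samples (illustrative only); None marks a missing measurement.
samples = [
    {"GEM": 1.0, "SO2": 0.4},
    {"GEM": 4.0, "SO2": None},
    {"GEM": None, "SO2": 0.1},
    {"GEM": 16.0, "SO2": 0.4},
]
base_case = listwise_delete(samples)
imputed_gm = impute(samples, "geometric_mean")
imputed_med = impute(samples, "median")
```

Listwise deletion keeps two of the four toy samples, while either imputation keeps all four. The second sample, dropped in the base case, returns with its measured GEM value, which is the mechanism by which imputation of non-Hg compounds can bring back Hg peaks that the base case never saw.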
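The effect of combining GOM and PBM into RM on the fraction of values below MDL, and the coefficient of variation used as a variability measure, can be sketched as follows. This is a hedged illustration only: RM is taken as the sample-by-sample sum GOM + PBM, the single MDL value applied to all three series is a hypothetical simplification, and the concentrations are invented for the example.

```python
from statistics import mean, stdev


def percent_below_mdl(values, mdl):
    """Percentage of values strictly below the method detection limit."""
    return 100.0 * sum(1 for v in values if v < mdl) / len(values)


def combine_to_rm(gom, pbm):
    """Reactive mercury as the sample-by-sample sum GOM + PBM."""
    if len(gom) != len(pbm):
        raise ValueError("GOM and PBM series must have the same length")
    return [g + p for g, p in zip(gom, pbm)]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Made-up concentrations and a hypothetical MDL (illustrative only).
gom = [0.2, 0.5, 1.2, 0.1]
pbm = [1.5, 0.8, 2.5, 0.6]
mdl = 1.0
rm = combine_to_rm(gom, pbm)
below = {
    "GOM": percent_below_mdl(gom, mdl),
    "PBM": percent_below_mdl(pbm, mdl),
    "RM": percent_below_mdl(rm, mdl),
}
cv_rm = coefficient_of_variation(rm)
```

In this toy case the share of below-MDL values falls from 75 % (GOM) and 50 % (PBM) to 25 % for RM, which is the kind of reduction the data treatment aims for.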

Conclusions
Source apportionment analysis was conducted with PMF and PCA using concentrations of speciated Hg and other air pollutants collected at the KEJ site in 2009 and 2010. Year 2010 was characterized by reduced Hg and SO₂ emissions compared with 2009. However, GOM is more sensitive to the decrease in Hg emissions while GEM and PBM are not, underscoring the benefits of speciated Hg measurements. It was found that consideration of emission inventories and correlation among air pollutants is useful in factor/component interpretation.
Using PMF, the nature of each of the four factors identified was the same in 2009 and 2010. In both years, ambient concentrations of all three Hg forms at the KEJ site were dominated by contributions from the factor photochemistry and re-emission, and the contribution by sea salt was the smallest. However, slight variations between the 2 years were observed in the contributions by the other two factors (combustion emission, industrial sulfur).
Good agreement was found between PMF and PCA results. In each year, four components were extracted in PCA with air pollutants only. Three or four of them overlapped with factors obtained in PMF. PCA results suggest little association between Hg and sea salt, consistent with PMF. Furthermore, PMF and PCA had a similar shift of source profile/contribution from one year to another, suggesting both methods were able to respond to changing concentration levels and interrelationships among the air pollutants. In both years, inclusion of meteorological parameters in PCA led to extraction of an additional component, Hg wet deposition, while the identification of other components was not affected. Therefore, PCA is superior to PMF in terms of identifying factors related to atmospheric processes. With regard to atmospheric processes represented by negative correlation among variables, such as gas-particle partitioning of Hg (Table 8), PCA is more likely to identify them because component loadings reflect correlations, while it is difficult for PMF because its variable contributions in source profiles are all positive.
A comprehensive PMF model performance evaluation was conducted for each of the three Hg forms. Between the 2 years, the model performance was comparable. In both years, the observed daily GEM concentrations were well reproduced, but the agreement was relatively poor for GOM and PBM. On an annual basis, the model-measurement agreements of annual mean concentrations were excellent for GEM, very good for PBM, and acceptable for GOM.
The sensitivity of PMF results and model performance to different approaches of dealing with missing values and concentrations with large uncertainties was investigated. In our study of more than 160 samples with 15 or 14 air pollutants, increasing the sample size by geometric mean or median imputation of missing values is not effective in improving the model performance. With over 70 % of GOM and over 40 % of PBM concentrations below MDL in our dataset, the impact of large measurement uncertainties in GOM and PBM is much more significant. Scaling GOM and PBM to increase their concentrations or combining them to reactive mercury is effective in improving the model-measurement agreement. The identification of sources/processes and their contributions to speciated Hg are relatively insensitive to any of the data treatment options considered. The exception is that fewer sources/processes affecting ambient Hg were identified when GOM and PBM were excluded, further underlining the importance of monitoring speciated Hg.
The good agreement between PCA and PMF results in both years is encouraging although these two methods bear little resemblance. PMF partitions observed concentrations by solving mass balance equations, while PCA is a data reduction tool that explains the majority of the variance in the entire dataset with a small number of components. Our observation was made possible by the use of a multiple-year dataset. Future studies should conduct more PMF and PCA comparisons to validate our findings.
Overall, PMF results are quantitative and, when accompanied by model performance evaluation, carry more confidence. However, when ancillary air pollutant data are available, it is recommended to carry out both PCA and PMF simulations to verify the sources/processes identified.
Our PMF results suggest that PMF has difficulties reproducing daily concentrations of GOM and PBM because of high concentration episodes and large uncertainties due to low concentrations and a large percentage of below-MDL values. More attention should be devoted to those issues in future studies.

Data availability
The datasets can be accessed by contacting the authors. Alternatively, the speciated atmospheric Hg data can be accessed through the National Atmospheric Deposition Program's AMNet (Atmospheric Mercury Network) website; particulate inorganic ions and trace gases (SO₂, HNO₃, O₃) can be accessed through Environment Canada's NatChem (Canadian National Atmospheric Chemistry) website; PM2.5 can be obtained from Environment Canada's NAPS (National Air Pollution Surveillance) network website; meteorological data can be obtained from Environment Canada's Historical Climate Data website.

Figure 1. Map showing the locations of the sampling site (green), the top 19 SO₂ or NOₓ point sources (average of 2009 and 2010) (blue), and all Hg point sources in 2009 and 2010 (red), in Nova Scotia, Canada.

Figure 4. Box plot of reproduced to observed concentration ratios (upper whisker: upper 25 % of the distribution excluding outliers; interquartile range box: middle 50 % of the data; horizontal line in the box: median; lower whisker: lower 25 % of the distribution excluding outliers; ⊕: mean).

Table 1. General statistics of daily air pollutant concentrations (in µg m⁻³ unless otherwise noted) and meteorological parameters in 2009.

Table 3. PMF case design with different treatments of speciated Hg data.

Table 4. PCA input and setup.

Table 6. PMF model performances on speciated mercury in 2009 and 2010.
−, and Mg²⁺ are markers of sea salt in both PMF and PCA. Similarly, GEM, GOM, PBM, and O₃ indicate photochemistry. Both methods suggest strong contribution to or association between Hg compounds and photochemistry, but weak with sea salt. Both methods identified combustion and industrial sources, while
www.atmos-chem-phys.net/17/1381/2017/ Atmos. Chem. Phys., 17, 1381-1400, 2017. X. Xu et al.: Sources and processes affecting atmospheric Hg at Kejimkujik National Park, Canada

Table 9. General statistics of speciated Hg with different data treatment options.

Table 10. Impact of combining or excluding GOM and PBM on PMF factor contributions (> 15 %) to Hg compounds.