Improving air quality model predictions of organic species using measurement-derived organic gaseous and particle emissions in a petrochemical-dominated region

This study assesses the impact of revised volatile organic compound (VOC) and organic aerosol (OA) emissions estimates in the GEM-MACH (Global Environmental Multiscale–Modelling Air Quality and CHemistry) chemical transport model (CTM) on air quality model predictions of organic species for the Athabasca oil sands (OS) region in Northern Alberta, Canada. The first emissions data set that was evaluated (base-case run) makes use of regulatoryreported VOC and particulate matter emissions data for the large oil sands mining facilities. The second emissions data set (sensitivity run) uses total facility emissions and speciation profiles derived from box-flight aircraft observations around specific facilities. Large increases in some VOC and OA emissions in the revised-emissions data set for four large oil sands mining facilities and decreases for others were found to improve the modeled VOC and OA concentration maxima in facility plumes, as shown with the 99th percentile statistic and illustrated by case studies. The results show that the VOC emission speciation profile from each oil sand facility is unique and different from standard petrochemicalrefinery emission speciation profiles used for other regions in North America. A significant increase in the correlation coefficient is reported for the long-chain alkane predictions against observations when using the revised emissions based on aircraft observations. For some facilities, larger longchain alkane emissions resulted in higher secondary organic aerosol (SOA) production, which improved OA predictions in those plumes. Overall, the use of the revised-emissions data resulted in an improvement of the model mean OA bias; however, a decrease in the OA correlation coefficient and a remaining negative bias suggests the need for further improvements to model OA emissions and formation processes. The weight of evidence suggests that the top-down emission estimation technique helps to better constrain the fugitive organic emissions in the oil sands region, which are a challenge to estimate given the size and complexity of the oil sands operations and the number of steps in the process chain from bitumen extraction to refined oil product. This work shows that the top-down emissions estimation technique may help to constrain bottom-up emission inventories in other industrial regions of the world with large sources of VOCs and OA.


Introduction
Chemical transport models (CTMs) are useful tools to support clean energy policy decisions because they can be used to assess the impact of past and future pollutant emission changes on air quality (e.g., Schultz et al., 2003;Kelly et al., 2012;Rouleau et al., 2013;Lelieveld et al., 2015).CTMs can also be run in forecast mode with their output being used to support air quality forecasts (Moran et al., 2013;Chai et al., 2013).CTMs require pollutant emission inputs, typically at hourly intervals, at the model grid spatial resolution (Dickson and Oliver, 1991;Houyoux et al., 2000;Pouliot et al., 2012Pouliot et al., , 2015;;Zhang et al., 2018).The pollutant emission input files are based on the processing of emission inventories compiled for all emission sectors, usually at some geopolitical spatial resolution (e.g., county, province or state, or country), and may thus require the application of spatial disaggregation factor fields to allocate emissions to the model grid.North American emission inventories are typically derived from bottom-up approaches, where representative pollutant emission factors (e.g., pollutant mass emission per volume of fuel burned) are multiplied by activity factors (e.g., volume of fuel burned per unit time).In developed countries, industrial facilities are usually required to report estimates of their pollutant emissions to national inventories such as the National Pollutant Release Inventory (NPRI) in Canada (Government of Canada, Canada Gazette, 2018) and the National Emissions Inventory (NEI) in the United States (Office of the Federal Register, Protection of Environment, 2015).Updates of these inventories occur under a regulatory framework on a regular basis.However, reporting requirements may be limited to aggregated mass emissions on an annual basis (e.g., a total bulk mass of volatile organic compound, VOC, emitted rather than a detailed and observation-based emissions of individual speciated VOCs), with the subsequent use of VOC speciation profiles (splitting factors) to determine the relative contribution of the individual VOCs to the total VOC emissions.Uncertainties in the availability and assignment of appropriate VOC speciation profiles, spatial and temporal allocation factors (Mashayekhi et al., 2016), and/or unaccountedfor emitting activities result in the need to evaluate the impact of these assumptions through the comparison of CTM predictions with ambient observations.The Athabasca region of northeastern Alberta, Canada, has one of the largest reserves of oil sands (OS) in the world.The OS deposits are composed of bitumen, minerals, sand and clay.Oil sand near the surface is mined by open-pit mining techniques.The oil sand is then transported by heavyhauler trucks to crushers, followed by the addition of hot water to make the oil sand flow through pipelines to a bitumen extraction facility.Here, the bitumen is separated from the sand and clay by the use of organic solvents.The product is used either directly, upgraded on-site to crude oil, or transported to a remote upgrader facility.Volatile organic compounds from the bitumen have the potential to escape into the atmosphere as fugitive emissions during the mining, extraction, processing, or tailing discharge steps.The complexity and vast size of the oil sands operations make generating pollutant emission input files for CTMs a challenge (Cho et al., 2012;ECCC & AEP, 2016).
Organic compounds in the atmosphere are oxidized over time and, in the presence of sufficient levels of oxides of nitrogen, are important precursors to ozone formation (Seinfeld and Pandis, 1998).VOCs and semi-volatile organic compounds (SVOCs) are also precursors to secondary organic aerosol (SOA) formation (Griffin et al., 1999;Kanakidou et al., 2005;Robinson et al., 2007;Kroll and Seinfeld, 2008;Slowik et al., 2010;Stroud et al., 2011;Gentner et al., 2017).If the organic compounds have sufficiently low saturation vapor pressures, then upon release into the atmosphere they remain particle bound and are classified as primary organic aerosol (POA).Many specific organic compounds can also be toxic to human health and require explicit reporting in emission inventories (Stroud et al., 2016).
The Joint Oil Sands Monitoring (JOSM) program was developed by the federal government of Canada and the Alberta provincial government with input and consultation from the local indigenous population and industry stakeholder groups to monitor the potential impacts of pollutant emissions.During JOSM, top-down approaches to estimate emissions based on atmospheric observations provided a unique opportunity to compare with bottom-up-calculated emissions for the Athabasca OS facilities in Alberta, Canada (Gordon et al., 2015;Li et al., 2017).The mass-balance approach that was used is based on using box-shaped aircraft flight patterns around a facility and measuring pollutant concentrations and meteorological variables (wind speed and direction, air density).In this approach, the difference in pollutant mass fluxes entering and leaving the box is used to determine the total facility-wide emission rate, subject to assumptions such as minimal losses due to chemical oxidation between the emissions location and the nearby aircraft observations.Environment and Climate Change Canada's (ECCC) chemical transport model, GEM-MACH (Global Environmental Multiscale-Modelling Air quality and CHemistry), is being used in JOSM to assess the impact of current emissions and future emission changes on local air quality and downwind regional-scale acid deposition (Makar et al., 2018).In this model study, we make use of both regulatory-inventorybased and aircraft-observation-derived emissions data for VOCs and primary particulate emissions for six large OS mining facilities as inputs to GEM-MACH in order to assess the impact of these two different emission data sets on model predictions of VOC concentrations and OA formation.

Methods
The GEM-MACH model uses the ECCC operational weather forecast model (Global Environmental Multiscale, GEM) as the core operator for dynamics and microphysical processes (Côté et al., 1998a, b;Girard et al., 2014).GEM-MACH is an "online" CTM -the chemistry, vertical diffusion, and pollutant deposition routines exist as a set of subroutines contained and called from within GEM's meteorological physics package (Moran et al., 2010;Makar et al., 2015a, b).The gasphase chemistry scheme is based on the ADOM-II mechanism (Acid Deposition and Oxidant Model, version II), originally developed for continental boundary-layer oxidant formation.The VOC lumped species used in GEM-MACH are described in Stroud et al. (2008).The focus here is on evaluating volatile aromatic and alkane species of anthropogenic Table 1.Facility total emission rates for three lumped organic species and PM 2.5 calculated with the bottom-up, base-case inventory, CEMA facility-specific VOC profiles (labeled "base case") and the top-down measurement-derived rates (labeled "revised-emission case", scaled to tonnes per year for VOCs or tonnes per August and September for PM 2.5 ).Emission rate increase/decrease of more than ±500 tonnes compared to base case is shown in bold/italic.M/S: Millennium and Steepbank; ML: Mildred Lake; M/J: Muskeg and Jackpine.VOC revised emissions are based on annual estimates, derived in Li et al. (2017).The estimates consider monthly and annual oil production yields reported by facilities for the plant stack emissions.For tailing ponds and mine faces, the VOC estimates are calculated using a surface-to-atmosphere mass transfer model considering ambient temperature and wind speed.* PM 2.5 revised emissions are based on 2-month emission (August and September) rather than based on an annual estimate (Zhang et al., 2018) due to uncertainties in calculating dust emissions in the winter months.
origin.The aerosol size distribution is described by a 12bin sectional approach based on the Canadian Aerosol Module (CAM) (Gong et al., 2003;Park et al., 2011).The SOA scheme is based on a two-product fit to smog chamber data using the SOA yield equations derived from gas-particle partitioning theory (Pankow 1994;Griffin et al., 1999;Barsanti et al., 2013).In the GEM-MACH model's current SOA formation algorithms, after initial particle formation, the organic compounds in the particle phase are assumed to be converted rapidly to nonvolatile mass, as observed by recent studies (Cappa and Jimenez, 2010;Cappa and Wilson, 2011;Lopez-Hilfiker et al., 2016) and recommended by modeling studies (Shrivastava et al., 2015).However, other recent observation studies suggest that SOA "chemical aging" over hours to days is quite complex and involves further gas-phase oxidation and fragmentation reactions (Jimenez et al., 2009;Donahue et al., 2014), as well as potential particle-phase oxidation and oligomer reactions (McNeill et al., 2015).The particle oligomer reactions are rapid, often acid catalyzed, and can result in conversion to nonvolatile mass (Liggio et al., 2005;Kroll et al., 2005).We discuss below the evidence from this work on the likelihood that these additional missing processes are still impacting our model organic aerosol bias.

Emissions
The Canadian base-case emissions were derived by combining several emission inventories, targeting 2013 as the base year.This base year was chosen to align with the JOSM 2013 intensive field study period, which provided the observations for the model-observation comparisons that follow.Canadian emissions for industrial facilities, including the Athabasca OS mining facilities, were obtained from the 2013 NPRI.The US base-case emissions were obtained from the 2011 US NEI version 1 (Eyth et al., 2013).
These base-case, bottom-up emissions inventories were processed with the SMOKE (Sparse Matrix Operator Kernel Emissions) emissions processing tool (https://www.cmascenter.org/smoke,last access: 26 January 2018), which includes three major steps corresponding to spatial allocation, temporal allocation, and chemical speciation (for NO x , VOC, and PM).The base-case VOC speciation profiles used by SMOKE for the OS surface mining facilities were obtained from the CEMA (Cumulative Environmental Management Association) inventory (Davies et al., 2012;Zhang et al., 2015).
For the sensitivity run, speciated VOC emissions from the base case for four OS mining facilities (Suncor Millennium and Steepbank, Syncrude Mildred Lake, Shell Canada Muskeg and Jackpine, and CNRL Horizon) were revised by replacing them with the top-down emission rates estimated by Li et al. (2017) while primary PM emissions were revised for six oil sand facilities (Suncor Millennium and Steepbank, Syncrude Mildred Lake, Shell Canada Muskeg and Jackpine, CNRL Horizon, Syncrude Aurora North, and Imperial Oil Kearl) (Zhang et al., 2018).The VOC and PM chemical speciation profiles used for these facilities were also revised using the aircraft-observed VOC speciation (Li et al., 2017) and ground-based PM filter analysis (Wang et al., 2015), respectively.The set of emissions input files making use of these revisions is hereafter referred to as the "revised emissions", while the original emissions input files without these changes are referred to as the "base-case emissions".A detailed description of the development of the emission inventory and emissions processing steps to create the model-ready files (hourly gridded emission fields for the same domain and grid spacing as the model) for the base case and revised version are described in Zhang et al. (2018).Table 1 compares the facility emission rates for four species for the base case and revised-emissions case.The changes are not consistent from species to species and are not uniform across facilities.Interestingly, the facilities that use paraffinic solvents for bitumen extraction (e.g., Shell Muskeg and Jackpine) were associated with the largest ALKA emission (long-chain alkanes) increases and aromatic decreases.The Supplement section includes figures illustrating the emission difference maps for the oil sand region (absolute and relative difference) showing the spatial distribution of emission changes between revised and base case.The changes are largest over the surface mines and tailing ponds.
Depending on whether bitumen extracted from the oil sand is upgraded on-site or not, the OS mining facilities can be classified into two broad types: 1. integrated extraction and upgrading facilities (Suncor Millennium and Steepbank, Syncrude Mildred Lake, and CNRL Horizon) and 2. extraction-only facilities (Shell Canada Muskeg and Jackpine, Syncrude Aurora North, and Imperial Oil Kearl).
Table 2 shows a comparison of the CEMA plant-specific VOC speciation profiles used in the base case for the two types of OS plants compared to two standard VOC speciation profiles for petrochemical facilities (no.9012 "Petroleum Industry -Average", no.0316 "Fugitive Emissions, Pipe/Valve Flanges") that were used by SMOKE to speciate more than half of the refinery emissions in the Houston area, the largest petrochemical cluster in the US.There are significant differences between the base-case OS plant VOC speciation profiles and the two commonly used standard oil refinery profiles.The OS integrated extraction and upgrading plant profiles are higher in long-chain alkenes (ALKE), toluene, and other aromatics than the standard profiles, while the extraction-only plant profile has the highest long-chain alkane fraction.The two standard profiles used for the base case and revised simulation (for speciating US and Canadian refinery emissions) have higher less-reactive species (e.g., propane, acetylene) and formaldehyde (profile no.9012) than both of the CEMA OS plant profiles.Note also that these differences in relative fractions result in substantial differences in the absolute emissions of certain groups of VOCs between the standard profiles for oil refineries and the facility-specific oil sand profiles.For reference, the aircraft-measurementderived facility-specific VOC speciation profiles used for four OS facilities in the revised-emissions case are presented in Zhang et al. (2018).The aircraft-measurement-derived profiles in Zhang et al. (2018), and used here for the revised case, are composite profiles since they encompass plant, tailing pond, and mining emissions.As such, they are not appro-priate for comparison with the profiles in Table 2, which are specific to plant emissions.The primary PM emissions from the OS facilities originate largely from off-road heavy-duty diesel trucks, plant stack emissions, and fugitive and wind-blown dust.The 2009/2010 CEMA inventory was used to specify the tail-pipe emissions from the off-road mining fleet and the 2013 NPRI inventory was used for fugitive road-dust emissions.The base-case inventory did not include wind-blown dust.For the revised inventory, the PM size distribution was measured during the 2013 field study for all six facilities and these data were used to constrain the revised PM emission input data set.Note that the PM emissions estimates based on the aircraft-measured aerosol data included the contribution of wind-blown dust emissions.The aircraft-based PM emissions were re-binned for the 12 GEM-MACH PM size bins.The first eight size bins correspond to mass up to diameter 2.56 µm.Interestingly, the aircraft measured a much higher fraction of particulate mass in bin 8 (bounded by diameters 1.28 and 2.56 µm) compared to the mass fraction in bin 8 from the area-source PM size-distribution profiles used by SMOKE in processing the base-case emissions.In addition, a PM chemical speciation profile specific to OS fugitive dust emissions was created from an analysis of deposited dust collected from surfaces in the OS region (Wang et al., 2015); this speciation profile replaced the standard fugitive dust profile for unpaved roads from the US EPA (Environmental Protection Agency) SPECIATE v4.3 database in the revised-emissions processing.The resulting organic carbon fraction in the observationderived PM speciation profile was higher than that of the base-case emissions by about a factor of 3 (Zhang et al., 2018).In general, significantly higher POA emissions were observed over the open-pit mines for all facilities, except for the Imperial Kearl mine.The impact of the revised POA emissions will be discussed further in Sect.3.4.

Modeling
The GEM-MACH model was run in a nested configuration with an outer domain covering the continental US and Canada and an inner domain covering Alberta and Saskatchewan.The continental-scale GEM-MACH model (10 km resolution) and the Canada-wide GEM weather model (2.5 km resolution) were run first.These provided the chemical and meteorological lateral boundary conditions, respectively, for the high-resolution GEM-MACH 2.5 km resolution run, which has a domain covering the provinces of Alberta and Saskatchewan (Fig. 1).The two models providing boundary conditions were run on a 30 h cycle, of which the first 6 h were spun up and discarded, while the remaining 24 h provided boundary conditions for the 2.5 km GEM-MACH simulation.The initial conditions subsequent to the starting model simulation for each overlapping 24 h 2.5 km GEM-MACH simulation came from the end of the previous 2.5 km GEM-MACH simulation.This strategy was used to  allow the two boundary condition simulations to make use of assimilated meteorological analyses.The sequence of model simulations was started for 10 August 2013 and run until 7 September to cover the 2013 JOSM intensive field study period.

Observations
The NRC (National Research Council) Convair two-engine turboprop aircraft was used to collect air quality observations during the JOSM 2013 intensive field study.The aircraft was equipped with a suite of instruments to measure air quality over 22 flights (see Li et al., 2017, Supplement Fig. S1).
Most of the flight hours focused on box-flight paths; these took the aircraft around the periphery of facilities at different heights, with the goal of deriving facility-wide emission rates by using observations of chemical concentrations and winds to estimate the mass of pollutants entering and leaving the box enclosures.Coupled with a mass-conserving flux model (Gordon et al., 2015), these aircraft data were used to estimate emissions from the encircled facilities.VOC and PM observations were collected by the instrumented research aircraft using different technologies.A proton-transfer-reaction mass spectrometer (PTR-MS) was used to measure a select number of VOCs at high temporal resolution (1 s) (Li et al., 2017).An aerosol mass spectrometer (AMS) was used to measure PM 1 mass and nonrefractory chemical composition (Liggio et al., 2016).A single particle soot photometer (SP2) was used to measure refractory black carbon aerosol (Liggio et al., 2016).A number of canisters were filled with ambient air on each flight and returned to the lab for GC-FID (gas chromatograph with flame ionization detector) and GC-MS (gas chromatograph with mass spectrometer) analysis of VOCs (Li et al., 2017).The canister VOC analysis measured 154 different C 2 to C 12 hydrocarbons (Dann and Wang, 1995).The resulting observation data were compared to the model output generated as described above.The 2.5 km GEM-MACH runs used a 120 s chemistry time step; 120 s model output values were linearly interpolated in time and space to the aircraft observation locations; all comparisons which follow make use of the resulting model-observation data pairs for the two simulations.

Toluene and other mono-substituted aromatics (TOLU) evaluation
The aircraft PTR-MS measurement data set was averaged to 10 s intervals for comparison to the GEM-MACH model  using 20 mixing-ratio bins and an increment of 0.2 ppbv per bin.It is clear that there are more high values (> 2 ppbv) produced by the sensitivity model run with revised emissions compared to the base-case model run.The number of observations in the highest value bins lies between the results from the revised and base-case versions.This can be quantified by using the 99 % percentile statistic (obs = 1.258 ppbv, revised = 1.906 ppbv, base = 0.934 ppbv).The 99 % percentile means that 99 % of the data points are lower than the value.The median concentration of the observations (0.061 ppbv) is higher than both the revised (0.038 ppbv) and base-case model (0.019 ppbv) simulated values, but is closer to the revised version.Table 3 lists statistical scores for the TOLU lumped species and the other species considered in this study.
The mean bias goes from a negative value with the base-case run to a positive value with the revised emissions.There is little difference in the correlation coefficient for the model vs. observation scatterplot between the base-case and sensitivity run.The changes to the VOC emissions for the revisedemissions run affected their total mass and speciation, and the observations were made sufficiently close to the sources that there was little time for oxidation.The main sources for VOCs are the processing plants, tailing ponds, mine faces, and off-road vehicles, and their spatial allocation (from CEMA, 2010) did not change significantly between the two model-emission versions.The main differences in the model time series between the two simulations are thus in magnitude of concentrations, and hence relatively invariant correlation coefficients might be expected.The correlation coefficient is more likely controlled by the meteorological model accuracy in the placement of the plumes (i.e., wind direction).
The largest increases in the TOLU emission, between the revised and base-case run, are noted for the Syncrude Mildred Lake facility over the tailing ponds and open-pit mine faces.Table 1 shows the changes on a facility-wide level.Notable increases are also calculated for the Suncor Millennium and Steepbank and the Canadian Natural Resources Ltd. (CNRL) Horizon facilities.The flights on 14 and 23 August have the largest TOLU mixing ratios for the aircraft study, and both flights correspond to box flights around the Syncrude Mildred Lake facility.The SI section includes the model and measurement time series comparisons (termed case studies) for the flights on 14 August (Supplement Fig. S5) and 23 August (Supplement Fig. S6).Overall, the magnitude of the mixing ratio maximum in the time series is better represented in the revised-emission simulation.This is also reflected in the better slope statistic in Table 3 for the revised-emission simulation.

Multi-substituted aromatics (AROM) evaluation
The model lumped AROM species includes all multisubstituted aromatics, with the most important species being the xylene isomers and trimethylbenzene isomers.These two species match with the PTR-MS C8 and C9 aromatic fragments, respectively.However, the observed C8 aromatic also includes ethylbenzene and the C9 aromatic also includes propylbenzene, which are lumped with TOLU in the model VOC speciation.Thus, we need to subtract these unwanted species from the totals used to compare to the model lumped AROM species.To do this, we use their correlation slopes with PTR-MS C7 aromatic from Sect.3.1.The new observation-derived AROM was calculated from the PTR-MS measurements as follows: C8 + C9 − 0.376 C7 − 0.0652 C7.
Figure 3 shows the histograms for the lumped AROM species for 10 s averaged points along all the flight tracks.The base model has a large number of high value points (> 2 ppbv), many more than the model simulations with the revised emissions, and also more than the observations.This can be quantified by using the 99 % percentile (obs = 0.7607, revised = 1.004, base case = 2.302).The median value for the observations is 0.0182 ppbv, smaller than both of the model versions (revised = 0.0236 ppbv, base  3 lists other statistical scores for the AROM lumped species.The mean bias and RMSE (root mean square error) are smaller for the revised-emissions run compared to the base case.However, there is a small degradation in the correlation coefficient with the sensitivity run.
The largest decreases in the AROM emission field between the revised and base-case emissions are again over the Syncrude Mildred Lake facility (refer to Table 1).There were also notable decreases over the CNRL Horizon and Shell Muskeg and Jackpine facilities, but positive changes in AROM emissions were noted over the Suncor Millennium and Steepbank facility (also refer to Supplement Fig. S2 for the emission spatial difference map).The Supplement section includes the model and measurement time series comparison for the flights on 23 August and 3 September.In general, the observed mixing ratio changes are closer in magnitude to the predictions from the revised-emission simulation compared to the base case for the plume intersects.

Long-chain alkanes (ALKA) evaluation
The long-chain alkanes (C 4 to C 12 ) were sampled by filling canisters with ambient air on-board the aircraft.Fig- ure 4 presents the histogram for the long-chain alkanes.The mixing ratios are divided into 20 bins each with a width of 3 ppbv.From the observed histogram, there is a wide range to the mixing ratios with a small number of very large concentrations, but also the first bin (0 to 3 ppbv) has a high percentage of the points.The model gas-phase mechanism represents all higher-carbon-number alkanes by a single lumped species, with chemical and physical properties derived from C 4 to C 8 alkanes.The base-case run calculates lower ALKA mixing ratios than the model version using revised emissions.The model using revised emissions is much better at reproducing the higher concentration points, particularly above 12 ppbv.This is quantified by the 99 % percentile of the data sets (obs = 29.9,base = 18.0, revised = 24.6).Other statistics for the lumped ALKA species are shown in Table 3.The mean bias went from a small negative value to +1.98 ppbv.The slope decreased by a small value, but the y intercept increased, which also increased the RMSE for the run with the revised emissions.The correlation coefficient improved significantly for the model run with revised emissions.
The revised ALKA emissions are considerably higher for the CNRL Horizon and Shell Muskeg and Jackpine facilities, but have smaller changes for the other facilities (refer to Ta-ble 1), possibly reflecting differences in the processing activities between the facilities.Overall, the time series analysis for the aircraft flights (refer to Supplement Fig. S10 and related discussion in the Supplement) showed mixed improvements for ALKA associated with the revised emissions.The large increases in ALKA emissions in the sensitivity simulation for the CNRL facility did improve the model maxima for the plume intersects on 26 August.The analysis suggests further improvement in spatial allocation for the Shell facility may be needed.The higher ALKA mixing ratios also feeds back to higher SOA formation downwind of these facilities, as discussed below.
The use of aircraft observations to both derive emissions data and evaluate the subsequent model simulations might be taken as circular reasoning.We note first that observationderived emissions are frequently used in modeling (for example, continuous emissions monitoring system concentration observations are used to generate emissions data for large stack emitters), and, second, that the emissions are only one component of the overall modeling system.An improvement in the simulated VOC concentrations using observation-based emissions is only guaranteed if the emissions dominate the net model error.While our results show that, in general, the new emissions information does improve model performance, the results using those new data are not perfect, indicating other sources of error are contributing to the overall model performance.

Organic aerosol (OA) evaluation
Figure 5 illustrates the histograms for the organic aerosol observations and model results with base-case and revised emissions.A clear improvement is shown in the highest concentration bins (> 15 µg m −3 ) with the revised emissions.This can be quantified with the 99th percentile of the data (obs = 13.4 µg m −3 , revised = 9.3 µg m −3 , base = 4.9 µg m −3 ).The median statistic also improved (obs = 2.8 µg m −3 , revised = 0.84 µg m −3 , base = 0.70 µg m −3 ).The lower 5th percentile is also significantly under-predicted compared to observations and does not change much between the two model runs (obs = 0.49 µg m −3 , revised = 0.036 µg m −3 , base = 0.035 µg m −3 ).This reflects an under-prediction in the background OA predicted by the model, which is likely due to low biogenic SOA formation and aging in both model versions.The importance of widespread biogenic SOA formation from boreal forests has been reported in other work (Slowik et al., 2010;Tunved et al., 2006).
Additional statistics are presented in Table 3.The mean bias, RMSE, and slope all improve for the revised-emissions run, though the correlation coefficient decreases significantly for this run.To investigate the variability in the OA bias, we plotted the OA bias as a function of different measured variables.Figure 6 is a plot of the OA bias as a function of the observed black carbon (BC) aerosol for the base-case and sensitivity runs.The BC is a marker for petrochemical combustion, particularly diesel.For the base-case run, the OA negative bias is observed to increase in magnitude with observed BC.Points with high observed BC correlate well with emissions from the OS open-pit mines (Liggio et al., 2017), where the BC is likely emitted from the heavyhauler trucks.The locations with the largest OA bias were also consistent with the locations of mines and the transport wind direction.A review of the OS emission inventories suggests that about 70 % of the BC comes from the OS off-road diesel fleet.Including all points, the mean bias improves from −2.8 to −2.4 (see Table 3) when using the revised emissions.Figure 6b shows a zoomed plot for points with high observed BC (> 0.8 µg m −3 ).There is a clear improvement in bias for most of these points.The average bias for these high BC points improves from −6.8 µg m −3 for the base-case to −2.6 µg m −3 for the revised emissions.For emissions processing the increase in PM emissions was assigned to the processing plants (particle bin diameter D < 1 µm) or the surface mines (particle bin diameter D < 1 µm).Overall, Fig. 6 shows that, while the negative OA bias improves for samples high in BC concentration (i.e., influenced by petrochemical combus- tion or collocated with petrochemical combustion sources), there still remains an unaccounted-for negative OA bias.
Figure 7 is a scatterplot of the difference in predicted POA between the revised and base-case emissions runs vs. the difference in predicted total OA.A large fraction of the points fall along the 1 : 1 line, and hence for these points the difference between the two runs is almost completely due to the increased total primary PM emissions, and increased POA fraction of those emissions, of the revised-emissions simulations.The points with largest concentrations along the 1 : 1 line correspond to flights over the Syncrude Mildred Lake facility on 16 and 23 August and on 3 September.There is a subset of points, however, that lies below the 1 : 1 line; these correspond to points with significantly enhanced model SOA between the two runs (16 August flight over CNRL Horizon and 21 August survey flight over Shell Muskeg and Jackpine).The Supplement section includes the model and measurement time series comparisons for the flights on 21 and 23 August and 3 September.Overall, the case studies showed improved predictions for the magnitude of the organic aerosol change for the plume passages with the revised emissions; however, the base line organic aerosol was overpredicted for all case studies.

Organic aerosol model recommendations
The improvement in model PM 1 OA bias due to the use of the revised emissions is encouraging; however, the decrease in correlation coefficient suggests that the spatial allocation of PM 1 emissions may need further refinement.The remaining negative bias suggests that other important processes may be missing or underrepresented in the model.Three recommendations emerge from recent and current work: 1. SOA formation from fugitive IVOC emissions Recent publications suggest that fugitive intermediate volatile organic compound (IVOC) emissions from the OS open-pit mines are needed to represent SOA formation downwind of the OS region (Liggio et al., 2016).In our emissions revision, only a small portion of the IVOCs (dodecane C 12 ) were added and lumped into the long-chain ALKA lumped species.IVOC species with carbon number ≥ 13 were not measured by the Li et al. (2017) aircraft study and thus we do not have revised IVOC emissions included in this work.Furthermore, the ALKA lumped species has an SOA yield more representative of a lower-molecular-weight range, and the yield is known to increase with increasing carbon number, so the dodecane SOA contribution would be underestimated.Work is currently underway with GEM-MACH to implement a volatility basis set (VBS) approach to SOA formation.The VBS approach will more adequately represent the intermediate and semi-volatile volatility range and chemical aging of these lowervolatility compounds (Robinson et al., 2007).Future work will measure IVOC emissions using box flights around the oil sand facilities and open-pit mines.This will remove current uncertainties in models and help improve the negative bias in plumes.Implementing the VBS scheme will also enable the PM emissions used here (in both data sets) to be distributed into volatility bins.
Also, while the measurement-derived emissions are missing the IVOCs, the measurement-derived POA emissions may contain some gaseous VOC, IVOC, and SVOC species that react quickly and in one oxidation step yield products that condense onto particles.This rapid SOA mass produced would be measured in the box flights and, at least partially, accounted for in the updated OA emissions; however, this is labeled here as POA instead of fresh SOA.Furthermore, there is the potential for double counting if some of the very reactive gaseous precursors react to form SOA and this is accounted for in the measured POA.In this paper, we have tried to minimize this effect by examining the model performance in the "near field" from emission flights close to facilities.This will be the topic of future box modeling work with the new 2018 measurementderived IVOC and SVOC emissions to determine how much of the measurement-derived POA is derived from the fugitive open-pit mining IVOC and SVOC emissions and their rapid particle formation.

Background organic aerosol levels
The under-prediction in background OA was a general finding from the study; the cause is believed to be due to underestimated biogenic SOA, due to the lumping of biogenic monoterpene emissions into the anthropogenic ALKE model species in the model's gas-phase mechanism and the lack of speciated representation of other biogenic SOA precursors such as sesquiterpenes.Future work will update the biogenic SOA yield coefficients in the VBS approach using recent smog chamber results which account for gas-phase loss of organic species to chamber walls (Ma et al., 2017).

Spatial allocation of emissions
Future field studies should also focus on improving within-facility spatial allocation.For example, withinfacility data such as the GPS (Global Positioning Sys-tem) location of the mining trucks would be helpful to derive their activity diurnal profiles and to improve truck emission spatial allocation within a facility.The GPS data would also be useful to define the location of freshly excavated open-pit mines within a facility.

Conclusions
Overall, the weight of evidence suggests that the top-down emission estimation technique applied to the OS surface mining facilities helps to better constrain reported facility total organic emissions including fugitive sources, as shown by improved model results when the revised emissions are employed.We note that emissions from these sources are a challenge to calculate in bottom-up inventories due to the potential for fugitive emissions.For the mono-and multi-substituted aromatics (TOLU and AROM), the measured emission rates were more finely adjusted compared to the base emissions, as TOLU facility totals went up, AROM totals went down and total aromatic emissions (TOLU + AROM) were revised by only a small extent.The negative bias compared to observations for TOLU became a small positive bias and the large positive bias for AROM became only a small positive bias.The model's ability to predict very high aromatic concentrations in plumes improved with the revised emissions, as shown by the 99th percentile statistic and the case studies.
For the long-chain ALKA species, the revised emissions may have overcorrected, on average, as shown by the increase in mean bias for the entire aircraft data set.However, the correlation coefficient did improve significantly for the long-chain alkane predictions, suggesting the combination of alkane emission increases for some facilities and decreases for others helped to improve the spatial distribution of ALKA emissions.The results for some facilities suggest that further improvement could be achieved by putting more emissions at extraction processing plant locations (i.e., adjusting withinfacility spatial allocation).Interestingly, the alkane emission increases and aromatic emission decreases, derived from aircraft data (Li et al., 2018), were associated with the facilities that use paraffinic solvents for bitumen extraction (e.g., Shell Muskeg and Jackpine).Overall, the predictions of alkanes in high concentration plumes improved with the revisedemission data set, as shown by the 99th percentile statistic.
For PM 1 organic aerosol, the revised emissions improved the mean bias for predictions; however, a negative bias still exists and the improvement was associated with a decrease in correlation coefficient.The increase in predicted PM 1 OA concentration was largely due to the increase in POA emissions in the revised-emissions input files.The POA emissions increased because of a combination of larger measurementderived PM 1 emissions and the revised ground-observed PM speciation profile having a larger POA fraction.The increases in PM 1 POA emissions were largely allocated spa-C.A. Stroud et al.: Improving air quality model predictions of organic species tially to stack locations and this allocation may be a key factor in the degradation of the correlation coefficient, especially if the fine OA originates from mine-face fugitive emissions.Future work should focus on improving within-facility spatial allocation of emissions.The remaining negative bias in plumes likely stems from missing IVOC emissions in both of the emission data sets used here, as suggested by Liggio et al. (2015).Ongoing field work to measure the IVOC emissions using aircraft box flights is underway in a new 2018 measurement intensive field study.Upcoming modeling work with GEM-MACH will include the VBS approach to better represent lower-volatility compounds.
Data availability.The model results are available upon request to Craig Stroud (craig.stroud@canada.ca).GEM-MACH, the atmospheric chemistry library for the GEM numerical atmospheric model (©2007-2013, Air Quality Research Division and National Prediction Operations Division, Environment and Climate Change Canada), is a free software which can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation -either version 2.1 of the license or any later version.The specific GEM-MACH version used in this work may be obtained on request to craig.stroud@canada.ca.Many of the emissions data used in our model are available online at ECCC (2018a, b) and more recent updates may be obtained by contacting Junhua Zhang or Mike Moran (junhua.zhang@canada.ca;mike.moran@canada.ca).The aircraft observations used in this study are publicly available on the ECCC data portal (ECCC, 2018c).
Author contributions.CAS developed the study concept and design, performed the analysis of model output and measurement data, and wrote and modified the manuscript.PAM developed the GEM-MACH code and was the overall coordinator for the oil sand model studies.JZ created the emissions files used in the model.MDM led on model emission science and development.AA performed the GEM-MACH simulations and was a developer of the maestro suite.SML was the lead scientist coordinating the aircraft studies.AL performed the PTR-MS VOC measurements on the aircraft.KH was the on-board flight director and principal scientist in charge of the AMS instrument.MS was the principal scientist for the VOC canister measurements.
Competing interests.The authors declare that they have no conflict of interest.Special issue statement.This article is part of the special issue "Atmospheric emissions from oil sands development and their transport, transformation and deposition (ACP/AMT inter-journal SI)".

Figure 1 .
Figure1.The background image is the nested domain, at 2.5 km grid spacing, covering all of Alberta and Saskatchewan and encompassing the Athabasca oil sands study region (white box).The model field shown is for the lumped toluene species (TOLU) mass mixing ratio (µg kg −1 air).The inserted image on the right is the TOLU emission map (g s −1 grid cell) for the oil sands study region at the same hour as mixing ratio image on the left.The oil sands facilities' names are listed in white labels.

Figure 2 .
Figure 2. Histograms for (a) observed TOLU, (b) revisedemissions TOLU, and (c) base-case-emissions TOLU volume mixing ratios (ppbv).Points correspond to 10 s averaged aircraft and model data, sorted into 20 bins by volume mixing ratio.The inset boxes show the 50th and 99th percentile values for each histogram.

Figure 3 .
Figure 3. Histograms for (a) observed AROM, (b) revisedemissions AROM, and (c) base model AROM volume mixing ratios (ppbv).Points correspond to 10 s averaged aircraft and model data, sorted into 20 bins by volume mixing ratio.The inset boxes show the 50th and 99th percentile values for each histogram.

Figure 4 .
Figure 4. Histograms for (a) observed ALKA, (b) revisedemissions ALKA, and (c) base-case emissions ALKA volume mixing ratios (ppbv).Points correspond to canister grab samples and model data, sorted into 20 bins by mixing ratio.The inset boxes show the 99th percentile value for each histogram.

Figure 5 .
Figure 5. Histograms for (a) observed organic aerosol (OA), (b) revised-emissions OA, and (c) base-case emissions OA concentrations (µg m −3 ).Points correspond to 10 s averaged aircraft and model data.The inset boxes show the 50th and 99th percentile values for each histogram.

Figure 6 .
Figure 6.(a), (b) Organic aerosol model bias as a function of observed black carbon aerosol.The bottom panel is an enlargement of the upper panel showing only the data points for observed BC > 0.8 µg m −3 .The model results for the base-case emissions run are plotted in blue and points in red correspond to the revisedemissions run.The data plotted are for all the aircraft flights.

Figure 7 .
Figure 7. Difference in predicted POA concentrations between revised-emissions and base-case runs plotted as a function of the difference in predicted total OA concentration between the revisedemissions and base-case runs for all flights.Points along the 1 : 1 line show a difference solely from POA emission changes.Points below the 1 : 1 line show enhanced SOA formation.

Table 2 .
Zhang et al. (2018)OC speciation profiles (mass fractions) applied to the surface mining facilities in the Athabasca oil sands region compared to standard speciation profiles for Canadian and US petrochemical oil refineries (in ADOM-II chemical speciation).Data are based onZhang et al. (2018)and references therein.All four profiles are used in the base-case simulation.M/J: Muskeg and Jackpine; AN: Aurora North; ML: Mildred Lake; M/S: Millennium and Steepbank; CEPS: Canadian Emissions Processing.
(Moran et al., 1997)p to unity due to unaccounted-for or unassigned species and/or due to consideration of reactivity weighting for the ADOM-II mechanism.Refinery profile no.9012 is a profile from the Canadian Emissions Processing System(Moran et al., 1997).

Table 3 .
Statistical scores from the model simulations with revised and base-case emissions; all statistics are relative to observations.is the root mean square error.Y intercept corresponds to the model intercept of a model vs. observation correlation plot.Mean bias is the model-observation mean score.The better score for a given pair of statistics is shown in bold font. RMSE