Organic aerosol source apportionment by offline-AMS over a full year in Marseille


Abstract. We investigated the seasonal trends of OA sources affecting the air quality of Marseille (France), which is the largest harbor of the Mediterranean Sea. This was achieved by measurements of nebulized filter extracts using an aerosol mass spectrometer (offline-AMS). In total 216 PM2.5 (particulate matter with an aerodynamic diameter < 2.5 µm) filter samples were collected over 1 year from August 2011 to July 2012. These filters were used to create 54 composite samples which were analyzed by offline-AMS. The same samples were also analyzed for major water-soluble ions, metals, elemental and organic carbon (EC / OC), and organic markers, including n-alkanes, hopanes, polycyclic aromatic hydrocarbons (PAHs), lignin and cellulose pyrolysis products, and nitrocatechols. The application of positive matrix factorization (PMF) to the water-soluble AMS spectra enabled the extraction of five factors, related to hydrocarbon-like OA (HOA), cooking OA (COA), biomass burning OA (BBOA), oxygenated OA (OOA), and an industry-related OA (INDOA). Seasonal trends and relative contributions of OA sources were compared with the source apportionment of OA spectra collected from the AMS field deployment at the same station but in different years and for shorter monitoring periods (February 2011 and July 2008). Online- and offline-AMS source apportionment revealed comparable seasonal contributions of the different OA sources. Results revealed that BBOA was the dominant source during winter, representing on average 48 % of the OA, while during summer the main OA component was OOA (63 % of OA mass on average). HOA related to traffic emissions contributed on a yearly average 17 % to the OA mass, while COA was a minor source contributing 4 %. The contribution of INDOA was enhanced during winter (17 % during winter and 11 % during summer), consistent with an increased contribution from light alkanes, light PAHs (fluoranthene, pyrene, phenanthrene), and selenium, which is commonly considered as a unique coal combustion and coke production marker. Online- and offline-AMS source apportionments revealed evolving levoglucosan : BBOA ratios, which were higher during late autumn and March. A similar seasonality was observed in the ratios of cellulose combustion markers to lignin combustion markers, highlighting the contribution from cellulose-rich biomass combustion, possibly related to agricultural activities.
Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
Outdoor particulate air pollution is estimated to be responsible for approximately 3.3 million premature deaths each year worldwide, and this number is projected to double by 2050 (Lelieveld et al., 2015). Organic aerosols (OA) can contribute up to 90 % of the PM1, and therefore understanding their main emission sources and formation processes is a key prerequisite for the development of appropriate mitigation policies.
In the Mediterranean basin, sources and trends of OA remain scarcely investigated, despite their deleterious impact in such a densely populated region. The Mediterranean region is characterized by intense photochemistry during summer. Not surprisingly, the majority of the OA source apportionment studies conducted in the region using aerosol mass spectrometry (AMS; Canagaratna et al., 2007) focused on the summer period (e.g., Minguillón et al., 2011, 2016; Hildebrandt et al., 2011). Through positive matrix factorization (PMF) techniques, these studies revealed that during summer the oxygenated organic aerosol (OOA) fraction, formed by oxidation of gaseous precursors, represented the largest part of OA. Amongst these studies, the field deployment of the AMS in Marseille, the largest port in the Mediterranean, has demonstrated that this instrument is well suited for quantifying the contribution of industrial emissions. In that work, the industrial OA factor was identified by its high correlation with heavy metals and AMS polycyclic aromatic hydrocarbons (AMS-PAHs); moreover, strong increments of the industrial factor concentrations were systematically observed when winds shifted to the west-southwest, consistent with back-trajectory analysis highlighting the transport of emissions from an industrial pole. Overall the industry-related OA contributed on average 7 % of the bulk OA mass (El Haddad et al., 2011). However, these results were limited to 2 weeks of measurements during summer, while the contribution of industrial emissions during the rest of the year remains unknown. There is a general paucity of AMS and aerosol chemical speciation monitor (ACSM) datasets in the Mediterranean region during winter. Exceptions include AMS campaigns (Hildebrandt et al., 2011) covering a few weeks during late winter-early spring and studies with an ACSM (e.g., Minguillón et al., 2015).
The measurement of organic markers and elements (e.g., Salameh et al., 2015; Reche et al., 2012) at different stations indicates a substantial contribution from biomass burning (BB). However, the sources and chemical composition of this fraction and its evolution during the year remain uncertain. Modeling results within the European Monitoring and Evaluation Programme (EMEP) have shown that the south of France, together with Portugal, can be a major hotspot in Europe for OA during February-March, possibly due to agricultural fires (Denier van der Gon et al., 2015; Fountoukis et al., 2014). In this region, biomass burning OA (BBOA) can derive from various processes such as agricultural land clearing activities, wildfires, and domestic heating and therefore may have a variable chemical composition.
The current study capitalizes on the AMS measurements of offline samples collected over 1 year (2011-2012) in Marseille, an ideal environment for the characterization of urban emissions from biomass burning, traffic, and industrial activities and their transformation under high photochemical activity. The source apportionment results obtained from PMF applied to the OA mass spectra are corroborated using a comprehensive set of offline measurements including elemental and organic carbon (EC / OC) measurements, as well as measurements of elements by inductively coupled plasma mass spectrometry (ICP-MS), of molecular markers by gas chromatography mass spectrometry (GC-MS) and ultra-performance liquid chromatography mass spectrometry (UPLC-MS), and of major ions by ion chromatography (IC). We mainly focus on the sources and trends of winter OA and therefore we additionally analyzed an online-AMS dataset acquired at the same location during the winter of the previous year. The comparison of online- and offline-AMS data and organic marker concentrations enables an in-depth characterization of OA sources in Marseille and in particular the identification of the main processes by which biomass smoke is emitted and transformed in this region.

Site description
Marseille is the second largest city in France with more than 1 million inhabitants (2010). It hosts the largest harbor in France and in the Mediterranean Sea. Many port-related industries, especially petrochemical companies, are located in a large cluster. These facilities are situated about 40 km NW of the city and include steel facilities, coke production plants, oil storage and refining plants, and several shipyards. The Marseille commercial harbor is located in the vicinity of this industrial cluster and represents the third-largest harbor in the world for crude oil storage and treatment. During summer, typical wind patterns in the city of Marseille favor the transport of polluted air masses from the industrial cluster to the city, including the sea breeze and the light Mistral wind from the Rhône Valley. At night, the land breeze may transport air masses from an agricultural valley located east of the sampling site. A more detailed description of wind patterns in Marseille can be found in Drobinski et al. (2007) and Flaounas et al. (2009). The sampling location is classified as an urban background station and is situated in the urban park Cinq Avenue in a traffic-free zone near the city center (43°18′20″ N, 5° …).
In total, 216 24 h (midnight-to-midnight) integrated PM2.5 samples were collected on pre-baked (500 °C for 3 h) quartz fiber filters (150 mm diameter, Tissuquartz) between 30 July 2011 and 20 July 2012 using a high-volume sampler (Digitel DA80) operated at 500 L min⁻¹ (Batch 1). Filter samples were subsequently wrapped in aluminum foil, sealed in polyethylene bags and stored at −18 °C.

Offline-AMS analysis
This work discusses the offline-AMS analysis of 54 composite samples (created from the batch of 216 PM2.5 filters collected) which were analyzed by Salameh et al. (2017) for major ions, molecular markers and elements (Table S1 in the Supplement). A thorough description of the offline-AMS analysis can be found in . One punch per filter sample (from 5 to 25 mm diameter depending on the filter loading and on the number of punches per composite sample) was prepared for analysis. Punches from the same composite sample were extracted together in 15 mL of ultrapure water (18.2 MΩ cm, total organic carbon < 5 ppb, 25 °C) in an ultrasonic bath for 20 min at 30 °C. After extraction, filters were vortexed for 1 min, and the resulting liquids were filtered with 0.45 µm nylon membrane syringe filters. The liquid extracts were atomized in air using a custom-made two-nozzle nebulizer. The generated aerosol was dried using a silica gel diffusion drier and then measured by a high-resolution time-of-flight AMS (HR-ToF-AMS, running in V-mode). In the AMS, particles are flash vaporized (600 °C) and the resulting gas is then ionized by electron impact (70 eV), yielding quantitative mass spectra of the non-refractory submicron aerosol components, including OA, NO₃⁻, SO₄²⁻, NH₄⁺, and Cl⁻. A detailed description of the AMS operating principles, calibration protocols, and analysis procedures is provided by DeCarlo et al. (2006). In total about 10 mass spectra (mass range 12-300 Da, 60 s averaging time) were collected per composite sample. Between samples, a measurement blank was recorded via nebulization of ultrapure water to minimize and monitor possible memory effects of the system. In total five mass spectra were collected for each measurement blank.
Offline-AMS data were processed and analyzed using the HR-ToF-AMS analysis software SQUIRREL (Sequential Igor data Retrieval) v.1.52L and PIKA (Peak Integration by Key Analysis) v.1.11L for IGOR Pro software package (Wavemetrics, Inc., Portland, OR, USA). HR analysis of the mass spectra was performed in the mass range 12-115 Da and in total 217 ion fragments were fitted.
The interference of NH₄NO₃ on the CO₂⁺ signal was corrected according to Pieber et al. (2016), where the $(\mathrm{CO_{2,meas}}/\mathrm{NO_{3,meas}})_{\mathrm{NH_4NO_3,pure}}$ correction factor was 2.5 %, as determined from aqueous NH₄NO₃ measurements conducted regularly during the measurement period.
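As an illustration of the correction just described, the following minimal sketch assumes that the nitrate-induced CO₂⁺ artefact scales linearly with the measured nitrate signal, using the 2.5 % factor stated above. The function name and the example concentrations are hypothetical and are not taken from the original analysis software.

```python
def correct_co2_for_nitrate(co2_meas, no3_meas, ratio_pure=0.025):
    """Remove the NH4NO3-induced artefact from a measured CO2+ signal.

    ratio_pure is the (CO2,meas / NO3,meas) ratio observed for pure NH4NO3
    (2.5 % in the text). The linear form of the correction is an assumption.
    """
    return co2_meas - ratio_pure * no3_meas


# Hypothetical signals in arbitrary but identical units.
co2_corrected = correct_co2_for_nitrate(co2_meas=1.00, no3_meas=4.0)
print(co2_corrected)
```

With these illustrative inputs the artefact amounts to one tenth of the measured CO₂⁺ signal, which shows why the correction matters most for nitrate-rich winter samples.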

Other offline measurements
A complete list of the measurements performed can be found in Table S1. To summarize, major ions (Ca²⁺, Mg²⁺, K⁺, Na⁺, NH₄⁺, NO₃⁻, SO₄²⁻, Cl⁻, oxalate, malate, succinate, and malonate) were measured by IC according to the methodology described by Jaffrezo et al. (1998). A subset of the filters was selected for CO₃²⁻ quantification following the method described by Karanasiou et al. (2011). The method encompasses the fumigation of the filter samples with HCl. The CO₂ evolved by this acidification of the carbonates deposited on the filters is detected by thermal-optical transmittance determination. The CO₃²⁻ measurements agreed fairly well with the CO₃²⁻ estimate from ion balance calculations based on IC data (Fig. S1 in the Supplement). In the following discussion, ion concentrations from filter samples always refer to the IC measurements unless otherwise specified.
EC and OC were determined for each filter by thermal-optical transmittance using a Sunset Lab analyzer (Birch and Cary, 1996) following the EUSAAR2 protocol (Cavalli et al., 2010). The CO₃²⁻ concentration determined from the IC ion balance was then subtracted from the OC concentration. The water-soluble OC (WSOC) was measured with a total organic carbon analyzer (TOC) following the methodology described in Bozzetti et al. (2016) and references therein. Before the analyses, the liquid extracts were treated with a 2 M HCl solution for 1-30 min to remove the inorganic C fraction. Total nitrogen was determined using a TOC analyzer combustion tube. The NO₂ generated from the water-soluble N decomposition was detected by a chemiluminescence TNM-1 unit detector. Organic markers were measured via GC-MS analysis, following the methodology described in El Haddad et al. (2009), Favez et al. (2010), and Piot et al. (2012). In total 15 different PAHs, 19 alkanes (C19-C36), 8 hopanes, 5 phthalate esters, levoglucosan, 6 lignin pyrolysis compounds, 6 fatty acids, and 3 sterols were determined (Table S1). Thirty-three chemical elements (Table S1) were quantified using ICP-MS according to the procedure described in Chauvel et al. (2010) and the modifications suggested in El Haddad et al. (2011). A subset of 20 composite samples was selected for the quantification of methylnitrocatechol isomers (Table S1) via ultra-performance liquid chromatography coupled with an electrospray ionization ToF-MS (UPLC-ESI-ToF-MS), following the procedure described in Iinuma et al. (2010).

Intensive winter campaign
An HR-ToF-AMS was deployed at the same station (urban park Cinq Avenue) between 25 January 2011 and 2 March 2011 to monitor the real-time NR-PM1 aerosol chemical composition. Although February 2011 is not included in the sampling period covered by offline-AMS, these online measurements provide a good opportunity to compare the separation, relative contributions, and winter seasonal trends of the OA sources retrieved by the offline- and online-AMS source apportionment procedures. Summer offline-AMS results were instead compared with online-AMS source apportionment results reported by El . The AMS was operated with an averaging time of 8 min, and in total 5633 mass spectra were collected during the monitoring period. We performed an ionization efficiency (IE) calibration by NH₄NO₃ nebulization, and the resulting IE value of 1.76 × 10⁻⁷ was applied to the dataset. The standard relative ionization efficiency (RIE) was assumed for organics (1.4), SO₄²⁻ (1.2), NH₄⁺ (4), and Cl⁻ (1.3), while the collection efficiency (CE) was estimated using the composition-dependent collection efficiency model (Middlebrook et al., 2012). Total AMS-PAHs were estimated from AMS data according to Dzepina et al. (2007).
Similarly to offline-AMS, online-AMS data were also processed and analyzed using HR-ToF-AMS Analysis software SQUIRREL v.1.52L and PIKA v.1.11L for IGOR Pro software package (Wavemetrics, Inc., Portland, OR, USA). HR analysis of the mass spectra was performed in the mass range 12-115 Da and in total 215 ion fragments were fitted.
A NOₓ analyzer was run in parallel to the AMS to monitor the real-time NOₓ concentration. A set of pre-baked (500 °C for 3 h) 24 h integrated PM2.5 filter samples was also collected during this campaign (Batch 2) following the same sampling and storage procedure described in Sect. 2.2. Filters were analyzed for major ions, metals, EC / OC, and organic markers, including n-alkanes, hopanes, PAHs, and lignin and cellulose pyrolysis products, using the techniques previously described in Sect. 2.2 (Table S1).

Implementation
The online- and offline-AMS source apportionment results discussed in this work were obtained from PMF analysis (Paatero and Tapper, 1994) of AMS spectra using the Multilinear Engine (ME-2; Paatero, 1999). The Source Finder toolkit (SoFi; Canonaco et al., 2013, v.5.1) for Igor Pro (Wavemetrics, Inc., Portland, OR, USA) served as the interface for data input and result evaluation. PMF is a multilinear statistical tool used to describe the variability of a multivariate dataset as the linear combination of static factor profiles times their corresponding time series, as described in Eq. (2):

$$x_{i,j} = \sum_{z=1}^{p} g_{i,z}\, f_{z,j} + e_{i,j} \tag{2}$$

Here $x_{i,j}$, $g_{i,z}$, $f_{z,j}$, and $e_{i,j}$ represent, respectively, elements of the data matrix, factor time series matrix, factor profile matrix, and residual matrix, while subscripts $i$, $j$, and $z$ denote time elements, variables (in our case AMS fragments), and discrete factor numbers, respectively. $p$ represents the total number of factors selected by the user for a given PMF solution. The PMF algorithm returns only $g_{i,z}$ and $f_{z,j}$ values ≥ 0 and solves Eq. (2) by minimizing the object function $Q$, defined as

$$Q = \sum_{i} \sum_{j} \left( \frac{e_{i,j}}{s_{i,j}} \right)^{2} \tag{3}$$

Here $s_{i,j}$ is an element of the error input matrix. PMF is subject to rotational ambiguity, i.e., different $G \cdot F$ combinations characterized by the same $Q$ can exist. The ME-2 implementation of the PMF algorithm offers an efficient exploration of the solution space by directing the solution toward environmentally meaningful rotations by constraining the factor profile elements $f_{z,j}$ for one or more factors $z$. In the a-value implementation of ME-2, the elements of the factor profile matrix $F$ (in our case AMS fragments) are forced to predefined values $f_{z,j}$, allowing a certain variability defined by the a value, such that the modeled element $f'_{z,j}$ satisfies Eq. (4):

$$\frac{f_{z,n} - a\, f_{z,n}}{f_{z,m} + a\, f_{z,m}} \le \frac{f'_{z,n}}{f'_{z,m}} \le \frac{f_{z,n} + a\, f_{z,n}}{f_{z,m} - a\, f_{z,m}} \tag{4}$$

where $n$ and $m$ represent any two arbitrary variables in the normalized $F$ matrix. A complete description of the a-value approach can be found elsewhere (Canonaco et al., 2013).
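The bilinear model and the object function Q described above can be illustrated with a few lines of NumPy. This is only a sketch under stated assumptions: it evaluates Q for given G and F matrices and computes the limits within which a constrained profile may vary for a given a value. It does not perform the ME-2 minimization itself, and the matrices are hypothetical toy data.

```python
import numpy as np


def pmf_q(X, G, F, S):
    """Object function Q: sum of squared, uncertainty-weighted residuals."""
    E = X - G @ F  # residual matrix e_ij
    return float(np.sum((E / S) ** 2))


def a_value_bounds(f_ref, a):
    """Limits for each constrained profile element before renormalization."""
    f_ref = np.asarray(f_ref, dtype=float)
    return (1.0 - a) * f_ref, (1.0 + a) * f_ref


# Hypothetical toy data: 3 time points, 2 factors, 4 fragments.
G = np.array([[1.0, 0.5], [0.2, 2.0], [0.0, 1.0]])
F = np.array([[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]])
X = G @ F + 0.1             # constant residual of 0.1 in every cell
S = np.full(X.shape, 0.1)   # uncertainty equal to that residual

q = pmf_q(X, G, F, S)       # each of the 12 cells contributes about 1
low, high = a_value_bounds(F[0], a=0.2)
print(q, low, high)
```

Because every cell has a residual equal to its uncertainty, Q is close to the number of matrix elements, which is the situation Q/Q_exp values near 1 are meant to indicate.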
For the offline-AMS source apportionment, the PMF input data matrix was constructed as follows: each composite sample is represented by approximately 10 time points $i$, corresponding to the ∼ 10 mass spectra collected per composite sample (Sect. 2.4). The average of the corresponding measurement blank is subtracted from each point of the data matrix.
The error matrices were instead constructed as follows. For online-AMS source apportionment, the error matrix elements $s_{i,j}$ were calculated according to Allan et al. (2003) and Ulbrich et al. (2009) and included the uncertainty deriving from electronic noise, ion-to-ion variability at the detector, and ion counting statistics. $s_{i,j}$ also included a minimum error, which was applied according to Ulbrich et al. (2009). For the offline-AMS source apportionment, the error term $\delta_{i,j}$ was calculated in the same way, but a further term ($\sigma_{i,j}$) including the blank subtraction uncertainty was propagated according to Eq. (5):

$$s_{i,j} = \sqrt{\delta_{i,j}^{2} + \sigma_{i,j}^{2}} \tag{5}$$

Finally, for both online- and offline-AMS we applied a downweighting factor of 3 to all variables with a ratio of average signal to average error lower than 2. No variable with an average signal-to-error value lower than 0.2 was detected.
Dust and ash can contain significant amounts of inorganic CO₃²⁻. Both the IC balance and the CO₃²⁻ measurements revealed non-negligible contributions from CO₃²⁻ in the PM2.5 fraction (Fig. S1). Preliminary PMF results also resolved a factor correlating with Ca²⁺ (Supplement), which was characterized by high fCO₂⁺, suggesting a possible solubilization of CO₃²⁻ from dust which could affect the OA mass spectral fingerprint. Overall, as discussed in the Supplement, we could not achieve a clear inorganic dust separation using PMF, and thus we opted for a correction of the PMF input matrices. The measured pH of our filter extracts was never > 8, and therefore we can exclude the presence of CO₃²⁻ in the extracts and assume all solubilized CO₃²⁻ to exist as HCO₃⁻. Direct measurements of nebulized standard NaHCO₃ aqueous solutions revealed that thermal decomposition of HCO₃⁻ on the AMS vaporizer (600 °C) releases CO₂ (Fig. S2).
Currently no HCO₃⁻ correction for the OA spectra is implemented in the standard AMS fragmentation table (Aiken et al., 2008); therefore the CO₂⁺ signal arising from HCO₃⁻ needs to be subtracted from the OA AMS spectra. Offline-AMS PMF input matrices were corrected for HCO₃⁻ and rescaled to $\mathrm{WSOM}_i = \mathrm{WSOC}_{\mathrm{TOC},i} \cdot (\mathrm{OM{:}OC})_{\text{offline-AMS},i}$ according to the procedure described in the Supplement.
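The construction of the offline-AMS PMF inputs described above (blank subtraction, propagation of the blank uncertainty, and downweighting of weak variables) can be sketched as follows. The sketch assumes that the blank uncertainty σ is taken as the standard deviation of the blank spectra and that it is combined in quadrature with δ; both choices, like the function name and the toy numbers, are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np


def prepare_offline_inputs(spectra, blank_spectra, delta,
                           snr_threshold=2.0, downweight=3.0):
    """Return blank-subtracted data and the propagated error matrix.

    spectra, delta: arrays of shape (n_spectra, n_ions).
    blank_spectra:  array of shape (n_blank_spectra, n_ions).
    """
    blank_mean = blank_spectra.mean(axis=0)
    sigma = blank_spectra.std(axis=0, ddof=1)    # assumed blank uncertainty
    data = spectra - blank_mean                  # blank subtraction
    errors = np.sqrt(delta ** 2 + sigma ** 2)    # quadrature propagation
    # Ratio of average signal to average error for each ion.
    snr = data.mean(axis=0) / errors.mean(axis=0)
    errors[:, snr < snr_threshold] *= downweight  # downweight weak ions
    return data, errors


# Hypothetical example: 2 spectra, 2 ions, 2 blank spectra.
spectra = np.array([[10.0, 1.0], [12.0, 1.0]])
blanks = np.array([[1.0, 0.5], [1.0, 0.5]])
delta = np.ones((2, 2))
data, errors = prepare_offline_inputs(spectra, blanks, delta)
print(data)
print(errors)
```

In this toy case the first ion has a signal-to-error ratio of 10 and keeps its error, while the second has a ratio of 0.5 and its error is multiplied by 3.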

Online-AMS source apportionment optimization
In the following we describe the optimization of the online-AMS source apportionment results. We adopted different optimization strategies for online- and offline-AMS source apportionments (Supplement) as we encountered dissimilar mixing between sources. This is not surprising, as the two methods are characterized by different time resolutions and different monitoring periods (1 year for offline-AMS, 1 month for online-AMS), which in turn result in different variabilities apportioned by the PMF algorithm (daily for online-AMS vs. seasonal for offline-AMS). In order to optimize the source separation, we performed sensitivity analyses on PMF solutions according to the following scheme:
i. The number of factors is selected based on residual analysis.
ii. The unconstrained PMF solution is qualitatively evaluated in comparison with the constrained PMF solutions (a-value approach: cooking OA (COA) and/or hydrocarbon-like OA (HOA) constraints).
iii. Both the HOA and COA factor profiles are constrained by adopting an a-value approach. An a-value sensitivity analysis is performed (121 PMF runs scanning all the COA and HOA a-value combinations; a-value scanning step: 0.1).
iv. The 121 PMF runs are classified based on the cluster analysis of the COA diurnal cycles; the best clusters, and corresponding PMF solutions, are selected.
v. The PMF rotational ambiguity is explored. 200 bootstrap (Davison and Hinkley, 1997; Brown et al., 2015) PMF runs are performed by simultaneously varying the COA and HOA a-value combinations (using only the optimal a-value combinations identified in step iv). The average of the 200 bootstrap runs represents the online-AMS source apportionment average solution, and the corresponding standard deviation represents the source apportionment uncertainty.
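The count of 121 PMF runs in step iii follows from scanning a values from 0 to 1 in steps of 0.1 for both HOA and COA (11 × 11 combinations). A minimal sketch of how such a grid could be enumerated is given below; the function name is hypothetical, and the actual scan was run within the source apportionment software.

```python
from itertools import product


def a_value_grid(step=0.1, upper=1.0):
    """All (a_HOA, a_COA) combinations from 0 to `upper` in increments of `step`."""
    n_values = int(round(upper / step)) + 1
    values = [round(k * step, 10) for k in range(n_values)]
    return list(product(values, values))


grid = a_value_grid()
print(len(grid))          # number of PMF runs in the sensitivity scan
print(grid[0], grid[-1])  # least and most relaxed constraint pairs
```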
For online-AMS we selected a four-factor solution based on residual analysis. We investigated the evolution of the time-dependent $Q(t)/Q_{\mathrm{exp}}(t)$ when increasing the number of factors. $Q/Q_{\mathrm{exp}}$ is defined as the ratio between $Q$ (as defined in Eq. 3) and the remaining degrees of freedom of the model solution ($Q_{\mathrm{exp}}$), calculated as $i \cdot j - (j + i)\,p$ (Canonaco et al., 2013). A decrease of $Q/Q_{\mathrm{exp}}$ from lower- to higher-order solutions indicates an improvement in the variation explained by the model. In particular we calculated $\Delta(Q/Q_{\mathrm{exp}}(t))$, obtained as the difference between the $Q/Q_{\mathrm{exp}}(t)$ value for a $z$-factor solution and the value obtained from the $(z-1)$-factor solution, where $z$ indicates the number of factors. We observed a large reduction of $\Delta(Q/Q_{\mathrm{exp}}(t))$ until four factors (Fig. S4). Higher-order solutions provided only minor contributions to the explained variability and, in terms of solution interpretability, resulted in a splitting of primary sources which could not be unambiguously associated with specific aerosol sources or processes.
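The residual-based choice of the number of factors can be sketched numerically. The example below computes Q_exp from the matrix dimensions and the change in Q/Q_exp between consecutive solutions. It is a simplified, time-integrated version of the time-dependent analysis described above, and the Q values are hypothetical; only the dimensions (5633 spectra, 215 ions) come from the text.

```python
def q_expected(n_times, n_vars, n_factors):
    """Remaining degrees of freedom: i*j - (j + i)*p."""
    return n_times * n_vars - (n_vars + n_times) * n_factors


def delta_q_ratio(q_values, n_times, n_vars):
    """Change in Q/Q_exp when going from (z - 1) to z factors."""
    ratios = {p: q / q_expected(n_times, n_vars, p)
              for p, q in q_values.items()}
    return {p: ratios[p] - ratios[p - 1]
            for p in sorted(ratios) if p - 1 in ratios}


# Hypothetical Q values for 1- to 5-factor solutions.
q_values = {1: 9.0e6, 2: 5.0e6, 3: 3.0e6, 4: 2.0e6, 5: 1.9e6}
deltas = delta_q_ratio(q_values, n_times=5633, n_vars=215)
for z, d in deltas.items():
    print(z, round(d, 3))
```

In such a scan, the number of factors would be chosen where the magnitude of the change drops sharply, which in this invented example happens after four factors.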
Using an a-value approach, we constrained HOA and COA profiles from Mohr et al. (2012) and Crippa et al. (2013), respectively. Leaving COA and/or HOA unconstrained enabled resolving COA only by increasing the number of factors (> five-factor solutions), while in the four-factor solutions we observed a splitting of an OOA factor which could not be attributed to specific processes. Unconstrained PMF yielded HOA and COA time series that correlated well with the constrained solutions; however, in the unconstrained case, HOA and COA factor profiles showed higher fCO₂⁺ in comparison to literature studies (Mohr et al., 2009, 2012; Bruns et al., 2015; Docherty et al., 2011; Setyan et al., 2012; He et al., 2010) and in comparison to the constrained PMF runs. This in turn resulted in higher HOA and COA concentrations, with background night concentrations 2-3 times higher than in the constrained solutions, possibly indicative of mixing with oxidized aerosols (Fig. S5). Similar differences between constrained and unconstrained PMF runs were also observed in Elser et al. (2016). In addition, the HOA : NOₓ ratio (µg m⁻³ / µg m⁻³) matched typical literature values reported for France (0.02; Favez et al., 2010) in the constrained PMF case (0.023), while for the unconstrained approach it showed higher values (0.033).
For both offline- and online-AMS the constrained HOA profiles were from Mohr et al. (2012), while the COA profiles were from Crippa et al. (2013). The HOA profile from Mohr et al. (2012) was selected for offline-AMS consistently with Daellenbach et al. (2016), since the same factor recovery distributions were applied in this work. The same profile was applied to online-AMS for consistency. Overall, as discussed in the Supplement, the HOA profiles from literature showed high cosine similarities with each other, indicating that the AMS mass spectral fingerprints from traffic exhaust are relatively stable from station to station and consistent also with direct emission studies, so that the choice of the constrained HOA profile is not crucial. More variability is instead observed among COA literature profiles. For COA we selected the profile from Crippa et al. (2013), which showed the lowest fC₂H₄O₂⁺ value among the considered ambient literature spectra (Mohr et al., 2012). This guaranteed a better separation of COA from BBOA, as C₂H₄O₂⁺ is strongly related to levoglucosan fragmentation.
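The cosine similarity used above to compare literature profiles is the normalized dot product of two mass spectra. A minimal sketch follows; the two short spectra are hypothetical and serve only to show the calculation.

```python
import numpy as np


def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine of the angle between two mass spectra (1 means identical shape)."""
    a = np.asarray(spectrum_a, dtype=float)
    b = np.asarray(spectrum_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical relative intensities at four fragments.
hoa_site_1 = [0.30, 0.40, 0.20, 0.10]
hoa_site_2 = [0.28, 0.42, 0.21, 0.09]
print(round(cosine_similarity(hoa_site_1, hoa_site_2), 4))
```

Because the measure depends only on spectral shape, two profiles with different total intensity but identical relative fragment contributions yield a value of 1.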
An a-value sensitivity analysis was performed by scanning all possible a-value combinations for HOA and COA given by an a-value range 0-1 with a step size of 0.1. In order to optimize the source apportionment results, we retained only the PMF solutions satisfying an acceptance criterion described hereafter.
PMF factors were associated with specific aerosol emissions/processes based on mass spectral features, diurnal cycles, and time series correlations with tracers. The identified factors were associated with traffic (HOA), cooking (COA), biomass burning (BBOA), and OOA. A thorough interpretation of the PMF factors will be discussed in Sect. 3.1. Given the absence of widely accepted tracers for COA emissions, the optimization of the COA contributions was based on the analysis of the COA diurnal cycles. From the HOA and COA a-value sensitivity analysis we obtained a set of 121 PMF solutions, each one including both factor profiles and factor time series. PMF solutions obtained in this way were categorized according to a cluster analysis of the normalized COA diurnal cycles (Elser et al., 2016, and references therein). The k-means clustering approach enables classifying the PMF solutions into $k$ clusters by minimizing a cost function $C$ (Eq. 6), which represents the sum of the Euclidean distances between each observation ($x_i$) and its respective cluster center ($\mu_{z_i}$). The number of clusters ($k$) that best represents the data is a critical choice in order to perform a proper cluster analysis. The addition of a cluster ($k + 1$) on one hand adds complexity to the solution but on the other hand decreases the cost function. A typical strategy to select the right number of clusters is to explicitly penalize the addition of new clusters by using Bayesian information criteria. This approach consists in adding a penalty term to Eq. (6) proportional to the number of clusters $k$ (Eq. 7), where $D$ denotes the dimensionality of the clusters (24 in our case, as we consider diurnal cycles with hourly time resolution). In this study the penalized $C$ function showed its minimum at five clusters (Fig. S6).
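The clustering step described above can be sketched with a small NumPy implementation of k-means with random restarts. Two points are assumptions and not taken from the source: the cost is computed with the usual squared Euclidean distances, and the penalty is written as k · D · ln(N), one common BIC-like form. The toy data stand in for the 24-dimensional normalized COA diurnal cycles.

```python
import numpy as np


def kmeans(X, k, n_init=100, n_iter=50, seed=0):
    """k-means with random restarts; returns (cost, labels, centers)."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _ in range(n_iter):
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            new_centers = np.array([
                X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
                for c in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Assumed cost: sum of squared distances to the assigned centers.
        cost = float((dist[np.arange(len(X)), labels] ** 2).sum())
        if cost < best[0]:
            best = (cost, labels, centers)
    return best


def penalized_cost(cost, k, n_obs, dim):
    """Cost plus an assumed BIC-like penalty proportional to k."""
    return cost + k * dim * np.log(n_obs)


# Hypothetical 2-D observations forming two obvious groups.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
cost, labels, centers = kmeans(X, k=2, n_init=20)
print(cost, labels, penalized_cost(cost, 2, len(X), X.shape[1]))
```

In practice the penalized cost would be evaluated for a range of k and the minimum taken as the number of clusters; the restarts address the dependence on initialization.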
The absence of convexity properties (i.e., several local minima can exist and the solution strongly depends on the initialization) represents a possible drawback of the k-means algorithm; therefore 100 random initializations of the k-means algorithm were conducted. The best clusters were selected based on a novel statistical analysis of the HOA, COA, and BBOA average cluster spectra (Supplement). Briefly, a cluster was retained when the HOA, COA, and BBOA average cluster spectra were not statistically different from the average reference HOA, COA, and BBOA spectra from literature (Mohr et al., 2009, 2012; Bruns et al., 2015; Docherty et al., 2011; Setyan et al., 2012; He et al., 2010; Table S3). A complete description of the best clusters selection is reported in the Supplement (Figs. S6-S10). Overall, three clusters were retained and two were rejected. Finally, we retained only the PMF solutions that were attributed to the three best clusters in more than 95 % of the k-means random initializations (Fig. S9).
In order to explore the rotational ambiguity of our PMF model we performed 200 PMF runs by initiating the PMF algorithm using different input matrices. The 200 different input matrices were generated using a bootstrap approach (Davison and Hinkley, 1997; Brown et al., 2015). In short, the bootstrap approach creates new input matrices by randomly resampling mass spectra ($i$ elements) from the original input matrices. Note that some mass spectra are resampled multiple times, while others are not represented at all. On average we randomly resampled 63 ± 1 % of the original spectra per bootstrap PMF run. Finally, each bootstrap PMF run was initiated by randomly varying the HOA and COA a values using the {a-value HOA; a-value COA} combinations previously selected as optimal from the cluster analysis (Fig. S10). Only solutions showing a higher COA diurnal correlation with the three selected clusters than with the two rejected clusters were retained. In this way we rejected 3.7 % of the solutions. In the following we present the average bootstrap solution. The source apportionment uncertainty was calculated as the variability of the retained bootstrap PMF runs.
Atmos. Chem. Phys., 17, 8247-8268, 2017 www.atmos-chem-phys.net/17/8247/2017/
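The figure of about 63 % of the original spectra per bootstrap run is what resampling with replacement is expected to give: the fraction of distinct rows drawn tends to 1 − 1/e ≈ 0.632. The sketch below illustrates this together with the random choice of an a-value pair for each run. The list of optimal a-value pairs is hypothetical, and no PMF run is actually performed.

```python
import numpy as np


def bootstrap_indices(n_spectra, rng):
    """Row indices of one bootstrap replicate (sampling with replacement)."""
    return rng.integers(0, n_spectra, size=n_spectra)


def unique_fraction(indices, n_spectra):
    """Share of the original spectra present at least once in the replicate."""
    return np.unique(indices).size / n_spectra


rng = np.random.default_rng(0)
n_spectra = 5633                                  # online-AMS spectra (from the text)
optimal_a = [(0.1, 0.3), (0.2, 0.3), (0.2, 0.4)]  # hypothetical (a_HOA, a_COA) pairs

runs = []
for _ in range(200):
    idx = bootstrap_indices(n_spectra, rng)
    a_hoa, a_coa = optimal_a[rng.integers(len(optimal_a))]
    runs.append((unique_fraction(idx, n_spectra), a_hoa, a_coa))

mean_fraction = float(np.mean([run[0] for run in runs]))
print(round(mean_fraction, 3))  # close to 1 - 1/e
```

In a full analysis, each set of indices would select the rows of the data and error matrices passed to one PMF run, and the spread of the retained solutions would give the uncertainty.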

Offline-AMS source apportionment optimization
In this section we discuss the optimization of the offline-AMS source apportionment. The PMF input matrices included 217 ions and 538 time elements deriving from about 10 AMS mass spectral repetitions collected for each of the 54 composite samples.
In order to optimize the source separation, we performed sensitivity analyses on PMF solutions according to the following scheme:
i. The number of factors is selected based on residual analysis.
ii. The unconstrained PMF solution is qualitatively evaluated in comparison with the constrained PMF solutions (a-value approach: COA and/or HOA constraints).
iii. The PMF rotational ambiguity is explored by performing 1080 bootstrap (Davison and Hinkley, 1997; Brown et al., 2015) PMF runs while simultaneously varying the COA and HOA a-value combinations. PMF solutions are retained based on the correlation of the PMF factors with external tracers. The PMF solutions retrieved from this step are relative to the water-soluble fraction. The corresponding water-soluble OC factor concentrations are determined by dividing the water-soluble OM factor concentrations (PMF output) by the OM : OC ratio determined from the corresponding factor mass spectra.
iv. Retained water-soluble OC PMF solutions from step (iii) are rescaled to the total OC concentrations by applying factor recoveries. Factor recoveries are fitted (using a priori information) to match total OC. Only PMF solutions and factor recoveries fitting OC with yearly and seasonally homogeneous residuals are retained. The average of the retained PMF solutions represents the average source apportionment results. The corresponding standard deviation represents the source apportionment uncertainty.
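The two conversions in steps (iii) and (iv) can be sketched as follows. The sketch assumes that a factor recovery is the water-soluble fraction of the factor, so that the total concentration is obtained by dividing by the recovery; the function names and the example recovery of 0.65 are hypothetical, while the BBOA OM : OC value of 1.85 is the one quoted later in the text.

```python
def ws_om_to_ws_oc(ws_om, om_to_oc):
    """Water-soluble OC of a factor from its water-soluble OM (PMF output)."""
    return ws_om / om_to_oc

def rescale_to_total_oc(ws_oc, recovery):
    """Total OC of a factor, assuming recovery = water-soluble / total."""
    if recovery <= 0.0:
        raise ValueError("recovery must be positive")
    return ws_oc / recovery

# Example: 1.85 ug m-3 of water-soluble BBOA with OM : OC = 1.85
# and a hypothetical recovery of 0.65.
ws_bboc = ws_om_to_ws_oc(1.85, 1.85)
bboc = rescale_to_total_oc(ws_bboc, 0.65)
```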
Based on analysis of the PMF residuals, we selected a five-factor solution to explain the variability of our dataset (Fig. S11). Similar to online-AMS, we monitored the decrease in Q/Q exp when increasing the number of factors (z).
In this study, a large Q/Q exp decrease was observed until five factors. We also observed a clear removal of the Q/Q exp structure until five factors, with higher-order solutions leading to additional factors that were not attributable to specific aerosol sources or processes. The five separated factors included HOA, COA, BBOA, OOA, and industry-related OA (INDOA). The complete validation of the PMF factors will be discussed in Sect. 3.2. As already mentioned, the HOA and COA profiles were constrained using an a-value approach. Consistent with online-AMS, we constrained the profiles according to Mohr et al. (2012) and Crippa et al. (2013), respectively. Unconstrained PMF runs for offline-AMS did not resolve HOA and COA factors. To explore the rotational ambiguity of our PMF model we performed 1080 bootstrapped PMF runs. In this case we performed a higher number of bootstrap runs than for online-AMS because the COA and HOA a-value combinations could not be optimized separately, as the offline-AMS method cannot resolve diurnal patterns. Each PMF run was also initiated using different input matrices. As previously mentioned the input matrices contained about 10 mass spectral repetitions per filter sample, and therefore the bootstrap algorithm was implemented to randomly resample 54 filter samples, each one with all the corresponding mass spectral repetitions. The final generated matrices included 54 samples; note that some filter samples could be resampled several times, while others were not resampled at all. On average 63 ± 5 % of the original samples were resampled. Finally, each of the PMF runs was initiated by randomly varying the HOA and COA a-values. The optimal PMF solutions were selected based on six acceptance criteria:
1. significantly (p = 0.05) positive Pearson correlation coefficient R between BBOA and levoglucosan;
2. significantly positive R between HOA and NO x ;
3. significantly positive R between INDOA and Se;
4. BBOA correlation with levoglucosan (R) significantly higher than the correlation between COA and levoglucosan;
5. HOA correlation with NO x significantly higher than the correlation between COA and NO x ;
6. INDOA correlation with Se significantly higher than the correlation between COA and Se.
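The filter-wise bootstrap described above, in which each drawn filter sample carries all of its mass spectral repetitions into the new input matrix, can be sketched as below. The data layout (a dictionary mapping a filter identifier to its list of spectra) and the function name are assumptions made for illustration.

```python
import random

def resample_filters(spectra_by_filter, rng):
    """Draw as many filters as in the original set, with replacement.

    Each drawn filter contributes all of its repetitions, so repetitions
    of one filter are never split across the resampled matrix.
    """
    filters = list(spectra_by_filter)
    drawn = [rng.choice(filters) for _ in filters]
    rows = [row for f in drawn for row in spectra_by_filter[f]]
    return drawn, rows

# Toy input: 54 filters with 10 placeholder "spectra" each.
toy = {f: [(f, rep) for rep in range(10)] for f in range(54)}
drawn, rows = resample_filters(toy, random.Random(1))
unique_fraction = len(set(drawn)) / len(toy)
```

Because only 54 units are resampled here, the distinct fraction scatters more widely around its expectation than when individual spectra are resampled, in line with the larger ± 5 % interval reported for the offline case.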
Criteria 1-3 analyze the correlation between factor and marker time series. The significance of a correlation was determined by calculating the Fisher transformed correlation coefficient l (Garcia, 2011):

$$ l = \frac{1}{2} \ln\left(\frac{1 + R}{1 - R}\right), $$

where R is the Pearson correlation coefficient between factor and marker time series. Subsequently we conducted a t test to verify the significance (95 % confidence level) of the correlation:

$$ t = l \sqrt{N - 3}. $$

Here, N represents the number of samples (54). For a confidence interval of 95 % the minimum significant correlation was R = 0.23. For criteria 4-6, in order to evaluate whether HOA, BBOA, and INDOA correlated significantly better than COA with their corresponding markers, we compared the l values obtained between each factor and its corresponding tracer (e.g., BBOA and levoglucosan) and between COA and the same tracer (e.g., levoglucosan), using a standard error on the l distribution of 1/√(N − 3) (Zar, 1999). In total, we retained 1.5 % of the PMF runs. The criteria that discarded the largest number of solutions were those based on the COA correlation with tracers of other sources (criteria 4-6). This suggests that for this dataset the COA separation from other sources was particularly difficult due to the absence of data with high temporal resolution, which aid the separation of a distinct COA diurnal cycle. Moreover, this separation is also complicated by the small COA contribution estimated by both online- and offline-AMS source apportionments (on average 0.4 µg m −3 as discussed in the following sections). Furthermore, the relatively small COA factor recovery (R COA median 0.54) hampers the COA apportionment by offline-AMS.
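A minimal sketch of these significance tests is given below. It assumes the standard Fisher transformation l = atanh(R), a one-sided test against the normal critical value 1.645 (95 %), and, for the comparison of two correlations, independent l values with a standard error of 1/√(N − 3) each. These choices are assumptions that reproduce the quoted minimum significant R of 0.23 for N = 54; they are not necessarily the exact test statistic used in the study.

```python
import math

def fisher_l(r):
    """Fisher-transformed correlation, 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def is_significantly_positive(r, n, z_crit=1.645):
    """One-sided test of R > 0 using the standard error 1 / sqrt(n - 3)."""
    return fisher_l(r) * math.sqrt(n - 3) > z_crit

def min_significant_r(n, z_crit=1.645):
    """Smallest R that passes the one-sided test for n samples."""
    return math.tanh(z_crit / math.sqrt(n - 3))

def correlates_better(r_factor, r_coa, n, z_crit=1.645):
    """Test whether r_factor exceeds r_coa (assumes independent l values)."""
    se_diff = math.sqrt(2.0 / (n - 3))
    return (fisher_l(r_factor) - fisher_l(r_coa)) / se_diff > z_crit

threshold = min_significant_r(54)
```

Since the factor and COA correlations share the same tracer, they are not strictly independent; a test for dependent correlations would be a natural refinement of this sketch.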
The PMF performed on offline-AMS mass spectra returned water-soluble OA factor concentrations, WSKOA i . To rescale the water-soluble OA concentration to the total OA, KOA i , we used the factor recoveries (R k ) reported by  for the HOA, COA, BBOA, and OOA factors (R HOA , R COA , R BBOA , R OOA ).
This is the first offline-AMS study where an INDOA factor was identified. Therefore, we determined the INDOA recovery (R INDOA ) in this study by performing a single parameter fit according to Eq. (11): Five hundred different fits were performed for each of the retained PMF solutions. Moreover, each fit was initiated using different R KOA combinations randomly selected from the R KOA combinations determined by  and reported in Bozzetti et al. (2016). In order to account for possible WSOC and OC systematic measurement biases, each fit was initiated by also perturbing the OC i , WSKOA i / (OM : OC) WSKOC , and R KOA inputs, assuming for each parameter a possible bias of 5 %, corresponding to the WSOC and OC measurement accuracy (we note that the sum of the WSKOC i / (OM : OC) WSKOC terms equals WSOC i , neglecting the PMF residuals). Finally the input OC i was randomly perturbed within its measurement uncertainty assuming a normal distribution of the errors. Among the performed fits we retained the recovery combinations and factor time series associated with unbiased OC i residuals (residual distribution centered on 0 within the first and third quartiles) for all seasons together and for summer and winter separately (Fig. S12). Accordingly, we retained 13 % of the solutions. All the retained factor recovery combinations can be found at https://doi.org/10.5905/ethz-1007-75. The median INDOA recovery was estimated as 0.69 (first quartile 0.65, third quartile 0.73; Fig. S13), while the retained R KOA for the other sources were consistent within the quartiles with the R KOA values reported by  despite their input value being perturbed as described above. The variability of the retained solutions is considered our best estimate of the source apportionment uncertainty, which accounts for offline-AMS repeatability, R KOA uncertainties, and model rotational uncertainty (explored by bootstrapping the input matrices and scanning the HOA and COA a-values).
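The single-parameter fit and the residual check can be sketched under stated assumptions. The sketch assumes that total OC is modeled as the sum over factors of the water-soluble factor OC divided by the factor recovery, that the recoveries of HOA, COA, BBOA, and OOA are held fixed, and that R INDOA is obtained by ordinary least squares. The R HOA and R COA values are the medians quoted in the text; the BBOA and OOA recoveries and all concentrations are synthetic placeholders.

```python
import statistics

def fit_indoa_recovery(oc, ws_koc, recoveries, ws_indoc):
    """Least-squares estimate of R_INDOA with the other recoveries fixed."""
    # Part of OC not explained by the factors with known recoveries.
    unexplained = [
        oc_i - sum(ws_i[k] / recoveries[k] for k in recoveries)
        for oc_i, ws_i in zip(oc, ws_koc)
    ]
    # Model: unexplained_i = ws_indoc_i / R_INDOA, linear in 1 / R_INDOA.
    num = sum(u * w for u, w in zip(unexplained, ws_indoc))
    den = sum(w * w for w in ws_indoc)
    return den / num

def residuals_unbiased(residuals):
    """True if zero lies between the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    return q1 <= 0.0 <= q3

# Synthetic check: build OC from a known R_INDOA and recover it.
recoveries = {"HOA": 0.11, "COA": 0.54, "BBOA": 0.65, "OOA": 0.90}
ws_koc = [
    {"HOA": 0.1, "COA": 0.1, "BBOA": 1.0, "OOA": 1.5},
    {"HOA": 0.2, "COA": 0.1, "BBOA": 0.2, "OOA": 2.0},
    {"HOA": 0.1, "COA": 0.2, "BBOA": 2.0, "OOA": 1.0},
]
ws_indoc = [0.5, 0.2, 1.0]
true_r = 0.69
oc = [
    sum(ws[k] / recoveries[k] for k in recoveries) + w / true_r
    for ws, w in zip(ws_koc, ws_indoc)
]
fitted = fit_indoa_recovery(oc, ws_koc, recoveries, ws_indoc)
```

Repeating such a fit many times with perturbed inputs, and keeping only the fits that pass `residuals_unbiased` overall and per season, would mimic the selection described in the text.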
Overall, for a generic factor KOA, we estimated the corresponding average relative uncertainty as follows: we calculated the campaign averages of the KOA concentrations for each of the v retained PMF solutions (KOA v ). The relative uncertainty of the KOA concentration was calculated as the standard deviation of KOA v divided by its average.
We also explored a four-factor solution without constraining the COA profile. In this case we performed 100 bootstrap PMF runs by randomly varying the HOA a-value. Results revealed that the COA separation (in the five-factor solution with COA constrained) affected the HOA separation more than that of the other factors (BBOA, OOA, INDOA). Overall, when comparing the four- and five-factor solutions (without and with COA constrained, respectively), the HOA concentrations were not statistically different within our estimated source apportionment uncertainty for 85 % of the samples, those of BBOA and OOA for 96 %, and those of INDOA for 94 %. The larger effect on HOA is probably due to the high similarity between COA and HOA spectra (Supplement), which are both characterized by high contributions from hydrocarbons.
Figure 1 displays the stacked seasonal average concentrations of the measured PM 2.5 components (ions measured by IC, elements measured by ICP-MS, EC by the EUSAAR method, and OM estimated as the sum of the offline-AMS PMF factors). Higher concentrations were observed during winter than in summer due to the enhanced contributions of NO − 3 and OM. NO − 3 increased during winter and autumn due to NH 4 NO 3 partitioning into the particle phase at low temperatures. OM concentrations were higher during winter due to the strong BBOA contributions. Overall OM was the dominant PM 2.5 component over the whole year, highlighting the importance of studying its sources. OM represented 46 % of the total mass with higher relative contributions during winter (51 %) than in summer (37 %).
SO 2− 4 represented the second-most-abundant PM 2.5 component, contributing on average 12 % of the mass. Among the other components, EC contributed 9 % of the mass, NO − 3 9 % (13 % avg during winter and 3 % avg during summer), NH + 4 8 %, the sum of the elements 7 % (3 % during winter and 13 % during summer, possibly because of dust resuspension), CO 2− 3 6 %, and Ca 2+ 2 %. K + , Cl − , Na + , and Mg 2+ individually did not exceed 1 % of the mass. In the following, subscripts avg and med denote average and median values, respectively.

Online-AMS source apportionment validation
PMF factors were associated with aerosol sources/processes based on mass spectral features (Fig. 2), correlation with tracers (Fig. 3), and diurnal cycles (Fig. 4). In the following all the reported times are UTC + 2 local times. The HOA  In this specific dataset they could partially derive from traffic, although from the AMS-PAHs multilinear regression we estimated that 79 % of the AMS-PAHs are related to BBOA and 21 % to HOA, indicating that BBOA dominates the PAH emissions. The AMS-PAHs : HOA ratio was 0.0020, while the AMS-PAHs : BBOA ratio was 0.0028. In general, industrial emissions can be an important source of PAHs at this location as discussed in El . In the presence of an industrial contribution, the BBOA vs. AMS-PAHs correlation would decrease. In this work the correlation between AMS-PAHs and the C 2 H 4 O + 2 fragment, typically related to levoglucosan fragmentation, was high (R = 0.87) and no AMS-PAHs spike was observed without a simultaneous increase of C 2 H 4 O + 2 (Fig. S15). Moreover the industrial-related OA factor resolved by El  was clearly associated with wind directions from W-SW (225-270°), while in this work wind directions were oriented from W-SW only for 7 % of the monitoring time, and this was not associated with any significant increase in the AMS-PAHs concentration (Fig. S16). The BBOA diurnal cycle, similarly to AMS-PAHs, showed higher values at night than during the day (Fig. 4). In addition, the highest BBOA concentrations were detected at night and associated with slow wind speeds from the E-NE, which is consistent with the night land breeze direction. Moreover, strong enhancements of the BBOA factor concentrations were observed when the wind direction shifted to the E-NE (typically around 18:00 during the monitoring period), suggesting that BBOA could be transported from the valleys near Marseille (Fig. S18).
The OOA profile showed the most oxidized mass spectral fingerprint with an O : C ratio of 0.67 in comparison to the values of 0.35 retrieved for BBOA, 0.12 for COA, and 0.03 for HOA. The OOA time series correlated well with the NH + 4 time series (R = 0.86), suggesting a probable secondary origin of the OOA factor (Lanz et al., 2008). The OOA diurnal cycle was flat, suggesting OOA to be representative of regionally transported oxygenated aerosols, consistent with the conclusions of El .

Offline-AMS source apportionment validation
PMF factors from the offline-AMS dataset were related to aerosol sources/processes based on mass spectral features (Fig. 5), seasonal trends, and correlation with tracers (Fig. 6). A comparison of the online-AMS and offline-AMS factor profiles is reported in the Supplement. In the following, for a generic k factor, we calculated the corresponding KOC i time series by dividing KOA i by the OM : OC ratio determined from the average HR-AMS factor profile.
During summer, when biomass burning contributions to EC are low, HOA correlated well with EC (R = 0.76) and yielded an HOC : EC (hydrocarbon-like OC = HOA / (OM : OC) HOA ) ratio of 0.64, similar to other European studies (El Haddad et al., 2009, and references therein). Over the whole year, the retained PMF solutions showed an HOA correlation with NO x (R) spanning between 0.23 and 0.49. These low correlations are comparable to the ones found by El  at the same station by online-AMS. In this case, the relatively low HOA correlation with NO x is probably due to the low R HOA (median 0.11) that, together with the low HOA concentration (1.5 µg m −3 avg , Sect. 4.1), results in small water-soluble HOA concentrations, leading to an uncertain HOA apportionment. This was already reported in previous offline-AMS studies Bozzetti et al., 2017). Although the HOA variability could not be well captured, the estimated HOA concentration was corroborated by the average HOA / NO x (0.02 µg m −3 / µg m −3 ), which was found to be consistent with El  for the same station and with Favez et al. (2010) for an alpine location in France.
The fourth factor (INDOA) was related to industrial emissions due to the high correlation with light alkanes (C19-  (Table S1, R = 0.31, 0.29, and 0.27, respectively), suggesting that these particular PAHs were overwhelmingly emitted by INDOA rather than BBOA. We note that phenanthrene, pyrene, and fluoranthene together represent 9.6 % avg of the PAHs mass quantified by GC-MS, indicating that in total PAHs are overwhelmingly emitted by BBOA. While Se is considered to be a unique coal marker in the literature (Weitkamp et al., 2005; Park et al., 2014), in Marseille this source is likely related to coke and steel production facilities (El Haddad et al., 2011). The average INDOA OM : OC (1.60) was intermediate between the OM : OC ratios of HOA (1.23) and COA (1.28) and those of BBOA (1.85) and OOA (1.82). El  resolved an industrial OA factor at the same station by online-AMS PMF. In that work the authors suggested a probable contribution of OOA to the resolved industrial factor, probably deriving from (photo)chemical aging during the transport from the industrial facilities to the receptor site occasionally accompanied by new particle formation processes within the industrial plume (as observed by the increased ultrafine particle number concentration associated with W-SW wind directions). Considering the average wind speed from W-SW (0.8 km h −1 ), and the distance between the receptor site and the Marseille commercial harbor (∼ 40 km), we estimate an aging time of several hours, which could lead to a more oxidized fingerprint in comparison to the fresh primary emissions. Overall this factor explained the largest fraction of the variability of S- and Cl-containing organic fragments such as C 2 HSO + , CH 2 SO + , CH 3 Cl + 2 , CH 4 SO + 3 , C 3 H 3 SO + 2 , and C 7 H + 16 .
The last factor was defined as OOA as it showed a highly oxygenated fingerprint with the largest CO + 2 fractional contributions (f CO + 2 ) among the apportioned factors (14 %, in comparison with 11 % for BBOA, 2 % for HOA, and 1 % for COA and INDOA). This factor showed on average the largest contributions over the year. Overall, the OOA : NH + 4 ratio was 2.3 avg , in line with the values reported by  for 25 different European sites (2.0 avg ; minimum value 0.3; maximum 7.3).
Previous offline-AMS  and online-ACSM studies (e.g., Canonaco et al., 2015) conducted in Switzerland and Lithuania reported the separation of two OOA factors characterized by different seasonal trends and different C 2 H 3 O + : CO + 2 ratios. In particular, the OOA factor characterized by the highest C 2 H 3 O + : CO + 2 ratio contributed mostly during summer and was linked to secondary OA from biogenic emissions. Here we calculated a (C 2 H 3 O + : CO + 2 ) OOA ratio by subtracting the C 2 H 3 O + and CO + 2 contributions deriving from primary sources from the measured C 2 H 3 O + and CO + 2 . Overall, C 2 H 3 O + OOA and CO + 2 OOA did not show a clear seasonality (Fig. S19), which hampered the separation of two OOA sources. Even though another OOA factor was not separated, El Haddad et al. (2013) estimated for the same location during summer a substantial contribution of secondary biogenic aerosol using 14 C measurements (no measurements were conducted in other seasons). As a consequence the OOA factor resolved in this work explains both secondary biogenic and aged/secondary anthropogenic sources. The absence of a clear increase in the (C 2 H 3 O + : CO + 2 ) OOA ratio in Marseille during summer could be explained by the large emissions of anthropogenic secondary OA (SOA) precursors during winter, leading to a different (C 2 H 3 O + : CO + 2 ) OOA seasonality in comparison with previous offline-AMS studies (Bozzetti et al., 2016), which were conducted either at rural sites characterized by different types of vegetation or in smaller urban areas. In general, several parameters affect the biogenic SOA concentrations and their separation, e.g., intensity of the biogenic precursor sources, air mass photochemical age, and NO x concentrations. All those parameters were different in Marseille from previous offline-AMS studies, which were conducted in central and northern Europe.
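The subtraction of the primary contributions can be sketched as follows, assuming that the primary part of a fragment is the sum over primary factors of the factor concentration times the fractional contribution of the fragment to the factor profile. The CO + 2 fractions of HOA and BBOA (2 % and 11 %) are those quoted above; the C 2 H 3 O + fractions, the concentrations, and the function names are hypothetical.

```python
def ooa_fragment(measured, primary_conc, primary_fraction):
    """Fragment signal left after removing the primary-factor contributions."""
    primary = sum(primary_conc[k] * primary_fraction[k] for k in primary_conc)
    return measured - primary

def ooa_ratio(c2h3o_meas, co2_meas, primary_conc, f_c2h3o, f_co2):
    """(C2H3O+ : CO2+) ratio attributed to OOA."""
    return (ooa_fragment(c2h3o_meas, primary_conc, f_c2h3o)
            / ooa_fragment(co2_meas, primary_conc, f_co2))

primary_conc = {"HOA": 1.0, "BBOA": 2.0}   # ug m-3, hypothetical
f_co2 = {"HOA": 0.02, "BBOA": 0.11}        # fractions quoted in the text
f_c2h3o = {"HOA": 0.01, "BBOA": 0.05}      # hypothetical fractions
ratio = ooa_ratio(0.36, 0.74, primary_conc, f_c2h3o, f_co2)
```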

OA source apportionment results and uncertainties
In this study, we present one of the first OA source apportionments conducted over an entire year in the Mediterranean region. This work also represents the first comparison between HR online-AMS and HR offline-AMS source apportionments conducted at the same location, although in two different periods. Previous studies  reported a comparison between offline-AMS and online-ACSM results.
Although related to different years and size fractions (PM 1 online-AMS, PM 2.5 offline-AMS), the offline-AMS source apportionment returned average seasonal factor concentrations not statistically different to online-AMS for both winter (Fig. 7) and summer (comparison with El Haddad et al., 2013, Fig. 8). We note that the total OC concentration quantified by online-AMS for PM 1 and by the thermal-optical procedure used for the offline-AMS source apportionment of PM 2.5 was not different on a seasonal scale considering our uncertainty, which includes time variability and measurements uncertainties.
Both online-and offline-AMS source apportionment revealed that BBOA was the largest OA source during winter. Offline-AMS source apportionment estimated an average BBOA concentration during winter 2011-2012 of 5.2 µg m −3 avg , representing 43 % avg of the OA. Similarly, online-AMS source apportionment revealed a BBOA concentration of 4.4 µg m −3 avg (corresponding to 42 % of OA) during February 2011. During summer, the offline-AMS BBOA concentration dropped to an average of 0.3 µg m −3 avg , representing 5 % of the OA. Not surprisingly, such low BBOA contributions were not resolved by online-AMS source apportionment during summer . On average the offline-AMS BBOA relative uncertainty was 9 %. As a comparison, the online-AMS BBOA average relative uncertainty was 6 %. Overall for both online-and offline-AMS, the BBOA contributions were the least uncertain among the primary sources, possibly because of the high loadings and the distinct seasonal and diurnal BBOA variability in comparison with the other separated factors. A comparison between the offline-and online-AMS source apportionment uncertainties can be carried out with the caveat that the online-AMS source apportionment uncertainties estimated in this work should be considered as a low estimate as they do not account for the AMS mass error deriving mostly from CE, and particle transmission. This source of uncertainty affects the total OA mass but not the relative contribution of the factors. By contrast, the OA mass uncertainty was accounted for in the offline-AMS source apportionment as the OA mass was rescaled to external measurements (WSOC and OC), the uncertainty of which was propagated in the final source apportionment error (Sect. 2.4).
On a yearly scale, the offline-AMS source apportionment revealed that OOA was the largest OA source, with the highest relative contributions during summer due to the reduced BBOA emissions. The OOA concentration during summer was estimated from offline-AMS at 3.0 µg m −3 avg , corresponding to 55 % of the OA mass. El  also reported OOA to be the dominant OA fraction during summer with a similar average concentration of 2.9 µg m −3 . During winter, the OOA concentration was estimated by online-AMS to be 3.9 µg m −3 avg corresponding to 38 % of the OA, while the OOA relative uncertainty was 4 %. As a comparison, the OOA relative uncertainty from offline-AMS was 6 % avg . The offline-AMS source apportionment revealed similar OOA concentrations during winter (3.4 µg m −3 avg corresponding to 27 % avg of the OA). Even though during winter the OOA concentration was higher than in summer, possibly due to partitioning and to the shallower boundary layer, the relative contribution decreased because of the strong BBOA contributions.
HOA is one of the most uncertain factors, with an average relative uncertainty of 39 % estimated from offline-AMS and 10 % from online-AMS analysis, where the larger uncertainty observed for offline-AMS derives mostly from the low R HOA and from the lower time resolution, which  avg , equivalent to 4 % avg of the OA. Overall, due to the low concentrations, the COA contributions were uncertain in both source apportionments (6 % for online-AMS, 73 % for offline-AMS). Similarly to HOA, the larger uncertainty observed for offline-AMS was most possibly due to the low R COA and the low time resolution, which did not enable the COA separation based on the diurnal variability. The summer COA contribution was not resolved from HOA by El , possibly because the COA reference mass spectrum was not constrained and because of the lack of HR data which typically aid the separation of the two sources.
Finally, the INDOA factor concentration estimated from offline-AMS was on average 2.1 µg m −3 during winter and 0.6 µg m −3 avg during summer, where this seasonal trend was driven by a strong episode that occurred during early February. The offline-AMS relative uncertainty was estimated as 17 %. As previously discussed (Sect. 3.1), this factor was not separated by online-AMS analysis (February 2011) because of the absence of clear events, which in the offline-AMS dataset were observed only over a short period during January-February 2012. An industrial factor was instead resolved by El  during summer 2008, with an average concentration of 0.3 µg m −3 avg . In that study, the industrial OA factor was also characterized by a low background intercepted by 10-fold spiking episodes.
From the sum of the offline-AMS factor concentrations we estimated the total OM mass. Using this OM and the measured OC we calculated the OM : OC ratio to be 1.40 on average. Specifically, during winter this ratio was 1.55, which is consistent with the online-AMS values determined from the HR-AMS spectra (median 1.52, first quartile 1.46, third quartile 1.59). The bulk OM : OC variability was driven by the source variabilities. Indeed the relative contribution of the most oxidized source (OOA) was higher during summer (mostly due to the absence of BBOA), but the relative contributions of the less oxidized sources (such as HOA and COA) were also higher during summer, mostly due to low BBOA contributions. The BBOA mass spectrum instead was associated with intermediate OM : OC ratios between the values of COA and OOA, and therefore influenced the bulk OM : OC ratio less strongly. Overall the combination of these effects led to a higher bulk OM : OC during winter.
Figure 9. Correlation between the sum of nitrocatechols (Table S1) with levoglucosan and BBOC.

Insights into the BBOA origin during winter
Methyl-nitrocatechol measurements showed high correlations with BBOA (Fig. 9, R = 0.95) and no correlation with OOA (R = 0.06, offline-AMS source apportionment). Similarly high correlations were already observed in other studies (e.g., Poulain et al., 2011). This large correlation difference suggests that the variability of the methyl-nitrocatechols is likely explained by the BBOA source. However, methyl-nitrocatechols are secondary compounds deriving from the nitration of catechols, which can be either directly emitted by wood combustion (Schauer et al., 2001) or generated by OH• oxidation of cresols directly released by wood combustion. m-cresol/NO x photooxidation experiments revealed a total contribution of all methyl-nitrocatechol isomers to the catechol SOA of approximately 10 %. Assuming methyl-nitrocatechols to be entirely apportioned to the BBOA factor, we estimate a methyl-nitrocatechol-SOA contribution to BBOA on the order of 8 %, indicating that part of the BBOA factor is of secondary origin. Previous studies (Atkinson and Arey, 2003) revealed an o-cresol lifetime in the atmosphere of 2.4 min towards NO 3 and 3.4 h towards OH (at 298 K, dark conditions). This would suggest that such fast SOA formation can be better traced by the high-time-resolution online-AMS source apportionment (8 min) than by the offline-AMS with 24 h time resolution, and in any case only in the BB plume or in the vicinity of the emission source. Nevertheless we did not observe statistically different ratios (within 1σ, error calculated as the time variability) of OOA : NH + 4 (1.5 avg and 1.25 avg for the offline-AMS and online-AMS source apportionments, respectively), OOA : BBOA (0.65 avg and 0.89 avg , respectively), and levoglucosan : BBOC (0.12 avg and 0.12 avg , respectively, Fig. 10) during winter, suggesting that despite the different time resolutions, the online and offline methods provide a comparable BBOA-SOA separation.
Overall these findings suggest that rapid SOA formation is not well captured by PMF and that rapidly formed SOA compounds (such as nitrocatechols) can be systematically attributed by PMF to factors commonly considered as "primary" (BBOA in this case). Both the online- and offline-AMS source apportionment revealed for the two different winter seasons a comparable temporal evolution of the levoglucosan : BBOC ratio (Figs. 10 and 11). This ratio showed typical literature values for domestic wood combustion in Europe during January and early February (0.05-0.2; Zotter et al., 2014; Herich et al., 2014; Minguillón et al., 2011), while during late autumn and March (Fig. 11) it increased up to 0.3, highlighting an evolution of the BBOA chemical composition. A similar seasonal trend was observed for the levoglucosan : vanillic acid, levoglucosan : syringic acid, and levoglucosan : non-sea-salt K + (nss-K + ; calculated according to Seinfeld and Pandis, 2006) ratios (Fig. 11). Although the online dataset was limited to 1 month of measurements, the levoglucosan : vanillic acid ratio also showed a statistically significant increasing trend from early February to the beginning of March (confidence interval of 95 %, Mann-Kendall test). These results suggest the occurrence of different types of biomass combustion during low-temperature winter days compared to late autumn and early spring: levoglucosan derives from cellulose pyrolysis (> 300 °C), while vanillic and syringic acids result from lignin combustion (Simoneit et al., 1998; Sullivan et al., 2008). Different reactivities/volatilities of BBOA markers may complicate this analysis. For this reason we discuss in the following the levoglucosan stability and propose that the major driver of the observed seasonal trends is the occurrence of different types of biomass combustion.
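The Mann-Kendall trend test mentioned above can be sketched in its basic form. The version below assumes the usual normal approximation with continuity correction and applies no correction for ties or autocorrelation; the input series is a synthetic placeholder, not the measured ratio.

```python
import math

def mann_kendall(values):
    """Return the Mann-Kendall S statistic and its normal-approximation z."""
    n = len(values)
    # S counts concordant minus discordant pairs in time order.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance without tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Synthetic, steadily increasing ratio series (placeholder values).
s_stat, z_stat = mann_kendall([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
increasing_at_95 = z_stat > 1.96  # two-sided 95 % threshold
```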
Previous studies revealed the levoglucosan reactivity toward OH• radical oxidation (Hennigan et al., 2010) both in gas and aqueous phase (Hoffmann et al., 2010). In the following we analyze the levoglucosan and nss-K + time series in order to investigate the possible effects of levoglucosan chemical stability and different types of biomass combustion on the seasonal evolution of the levoglucosan : nss-K + ratio. During summer nss-K + derives mostly from dust, while levoglucosan is depleted by both photochemistry (Hennigan et al., 2010) and low BBOA emissions. Not surprisingly the levoglucosan : nss-K + ratio showed lower average values in summer (0.23) than in winter (3.14). During winter nss-K + is considered to be mostly emitted by BBOA, and consistently in our dataset it shows a good correlation with BBOA tracers (R = 0.66 with syringic acid). Overall, the levoglucosan : nss-K + ratio during the cold season manifests a behavior that is opposite to the photochemical activity (with temperature considered as a proxy) as it shows higher values during March and late autumn (up to 7.11) and lower values in January and February (minimum = 2.79; Fig. 11) when temperature is lower and photochemistry is less intense. For these reasons we relate the winter levoglucosan : nss-K + variability to different types of combustion rather than to a levoglucosan depletion due to photochemistry. Furthermore we observed the highest levoglucosan concentrations (late autumn) simultaneously with the highest relative humidity (89 %) values, suggesting the depletion of levoglucosan by OH• radical oxidation in aqueous phase to be insignificant (Hoffmann et al., 2010). A similar winter seasonal behavior was observed also for plant waxes. Plant wax concentrations were estimated from high-molecular-weight n-alkanes (C24-C35) according to the methodology described by Li et al. (2010). This methodology is based on the observation that alkanes from epicuticular waxes preferentially contain an odd number of carbon atoms (Aceves and Grimalt, 1993; Simoneit et al., 1991). This was observed for a large variety of plants including broad leaf trees, conifers, palms, shrubs, grasses, and groundcover (Hildemann et al., 1996, and references therein). Waxes showed the highest concentrations during late autumn (up to 0.16 µg m −3 ) and in May (up to 0.17 µg m −3 ), while the minima were observed during winter (minimum 0.007 µg m −3 ).
Figure 10. Offline-AMS (February 2012) and online-AMS (February 2011) smoothed time-dependent levoglucosan : BBOC ratios. We note that the levoglucosan : BBOC comparison should not be considered on a day-to-day basis, where the levoglucosan : BBOC ratio in the 2 different years can be coincidentally equal or different, but rather on a monthly timescale where, as discussed in the paper, we observed a statistically significant (p = 0.05) evolution of the levoglucosan : BBOC ratio which is similarly captured by the two models.
Figure 11. Online- and offline-AMS time-dependent levoglucosan : BBOC, levoglucosan : vanillic acid, levoglucosan : syringic acid, and levoglucosan : K + ratios. The plant wax concentrations were determined from GC-MS measurements of alkanes with an odd number of carbons . As discussed in the main text the spike observed in late autumn could be related to incomplete green waste combustion.
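The odd-carbon predominance underlying the plant wax estimate can be illustrated with a short sketch. It assumes the commonly used formulation in which the wax contribution of an odd n-alkane C n is its concentration minus the average of the two neighboring even homologues, with negative values set to zero; whether this matches every detail of the Li et al. (2010) procedure is an assumption, and the concentrations are placeholders.

```python
def plant_wax_alkanes(conc):
    """Wax-derived part of each odd n-alkane.

    `conc` maps carbon number to concentration. For an odd carbon number n
    with both even neighbours available, the wax part is taken as
    conc[n] - 0.5 * (conc[n - 1] + conc[n + 1]), floored at zero.
    """
    wax = {}
    for n, c in conc.items():
        if n % 2 == 1 and (n - 1) in conc and (n + 1) in conc:
            excess = c - 0.5 * (conc[n - 1] + conc[n + 1])
            wax[n] = max(excess, 0.0)
    return wax

# Placeholder concentrations (arbitrary units) for C26 to C30.
example = {26: 1.0, 27: 3.0, 28: 1.0, 29: 0.5, 30: 1.0}
wax = plant_wax_alkanes(example)
total_wax = sum(wax.values())
```

In this example only C27 shows an odd-carbon excess, so the total wax estimate equals the C27 excess.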
In general, high-molecular-weight n-alkanes are typically detected in atmospheric aerosol in significant amounts during the growing season. Hildemann et al. (1996) estimated the highest plant wax concentrations in April-May in Los Angeles and Pasadena, where the climate is similar to that of Marseille. Similarly, we observed the highest concentrations during May. However, comparable plant wax concentrations were also observed in late autumn, during the period characterized by the highest ratios of levoglucosan to lignin combustion tracers (Fig. 11), suggesting a possible emission from open combustion of green wastes.
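The plant wax estimate from odd-carbon n-alkanes discussed above can be illustrated with a short sketch. The text cites Li et al. (2010) for the method but does not reproduce the formula, so the sketch assumes the commonly used odd-carbon excess form, wax Cn = Cn − 0.5 (Cn−1 + Cn+1), with negative values set to zero. That formula, the function names, and the example concentrations are assumptions for illustration, not the authors' implementation.

```python
def plant_wax_alkanes(conc):
    """Return the assumed wax-derived part of each odd n-alkane.

    conc maps carbon number -> measured concentration (any consistent unit).
    Assumed form: wax Cn = Cn - 0.5 * (C(n-1) + C(n+1)), floored at zero.
    Odd homologues without both even neighbours in conc are skipped.
    """
    wax = {}
    for n in sorted(conc):
        if n % 2 == 1 and (n - 1) in conc and (n + 1) in conc:
            excess = conc[n] - 0.5 * (conc[n - 1] + conc[n + 1])
            wax[n] = max(excess, 0.0)  # negative excess counts as no wax input
    return wax


def total_plant_wax(conc):
    """Sum the wax-derived contributions over all usable odd homologues."""
    return sum(plant_wax_alkanes(conc).values())


# Hypothetical concentrations, chosen only to show the arithmetic.
example = {24: 1.0, 25: 3.0, 26: 1.0, 27: 0.5, 28: 1.0}
print(plant_wax_alkanes(example))
print(total_plant_wax(example))
```

In this hypothetical example C25 exceeds the mean of its even neighbours and contributes to the wax estimate, while C27 does not and is set to zero.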
Taken together, the above observations suggest the occurrence of combustion of cellulose-rich material during March and late autumn, compared to lignin-rich biomass burning for residential heating during January. The combustion of cellulose-rich material is possibly related to agricultural waste burning at the beginning and at the end of the agricultural cycle. The emission of biomass burning plumes due to land clearing episodes during March has already been reported in other parts of Europe (Ulevicius et al., 2016) and has been previously modeled for southern France (Denier van der Gon et al., 2015; Fountoukis et al., 2014).
In this study we related the evolution of the BBOA composition over the cold season to the combustion of cellulose-rich and lignin-rich fuels, considering that lignin and cellulose are contained in different ratios in different biomass fuels. This designation should not be considered as an oversimplification of the combustion processes or of the fuel complexity but rather as a classification of the BBOA based on our observations of increasing lignin pyrolysis products over cellulose pyrolysis products during the coldest days.
We note that BBOA is described in our PMF models by only one factor which therefore potentially represents a combination of several types of biomass burning sources. Increasing the number of factors did not lead to an unambiguous separation of different BBOA sources, but the comparison with source-specific markers revealed a real BBOA composition evolution over the winter season with higher cellulose to lignin combustion tracer ratios observed during late autumn and early spring in comparison to January/February. This hypothesis of at least two types of BB sources (one linked to domestic heating, another to agricultural activities) is also supported by the direct PMF analysis of the organic and inorganic markers measured for Batch 1 (Salameh et al., 2017).
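The seasonal comparison of marker ratios used in this section (e.g., levoglucosan : vanillic acid or levoglucosan : nss-K+) amounts to forming a ratio for each sample and averaging by month. A minimal sketch follows; the record layout, the field names, and the concentrations are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def monthly_mean_ratio(samples, numerator, denominator):
    """Average the numerator : denominator ratio of the samples in each month.

    samples is a list of dicts with a "month" key and one key per marker.
    Samples with a non-positive denominator are skipped.
    """
    by_month = defaultdict(list)
    for sample in samples:
        den = sample[denominator]
        if den > 0:
            by_month[sample["month"]].append(sample[numerator] / den)
    return {month: mean(values) for month, values in by_month.items()}


# Hypothetical marker concentrations (same unit for both markers).
samples = [
    {"month": "2011-11", "levoglucosan": 800, "vanillic_acid": 10},
    {"month": "2011-11", "levoglucosan": 600, "vanillic_acid": 10},
    {"month": "2012-01", "levoglucosan": 600, "vanillic_acid": 30},
]
print(monthly_mean_ratio(samples, "levoglucosan", "vanillic_acid"))
```

A higher monthly ratio in this sketch corresponds to a larger share of cellulose combustion markers relative to lignin combustion markers, which is the quantity compared between late autumn, March, and January/February in the text.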

Conclusions
PM 2.5 filter samples were collected during an entire year (August 2011 to July 2012) at an urban site in Marseille, France. Filter samples were analyzed by water extraction followed by nebulization of the liquid extracts and subsequent measurement of the generated aerosol with an HR-ToF-AMS.
PMF analysis was conducted on the offline-AMS mass spectra and on online-AMS data collected at the same station during February 2011. Offline-AMS source apportionment results were also compared with a previous online-AMS source apportionment study of 2 weeks during July 2008 at the same location. The methods returned statistically similar seasonal factor concentrations, although different years and size fractions were considered (PM 1 for online-AMS, PM 2.5 for offline-AMS). OOA was the major source of OA during summer, representing on average 55 % of the OA mass, while BBOA was the dominant OA source during winter, contributing on average 43 % of the OA. Smaller contributions were estimated for HOA, INDOA, and COA, representing 17, 12, and 4 % of the OA mass, respectively. The contribution of primary anthropogenic sources (HOA + BBOA + COA + INDOA) was substantial over the year (62 % of OA on average), with larger absolute and relative contributions during winter (73 % of OA on average) associated with intense biomass burning activity.
Coupling offline- and online-AMS data with molecular markers showed increasing levoglucosan : BBOC ratios during the late winter-early spring period in both 2011 and 2012. This trend was also observed for the ratios between cellulose and lignin combustion markers (e.g., levoglucosan : vanillic acid): the ratios approached values more typical of European domestic wood combustion during January/early February, whereas late autumn and March were characterized by higher values of cellulose combustion markers, indicative of the influence of different types of fuels, possibly related to agricultural activities.
From the offline-AMS source apportionment, we observed a high BBOA correlation with nitrocatechols deriving from the nitration of catechols directly emitted by biomass combustion. These secondary components are rapidly formed in the atmosphere in the presence of NO3 radicals (lifetime of a few minutes). Overall, despite the different time resolution, online- and offline-AMS provided a comparable SOA-BBOA separation during winter. Nevertheless, in the case of fast SOA formation (relative to the online-AMS time resolution or to the transport time to the receptor site) this separation can be hindered, and further efforts are needed to improve the SOA separation from BBOA.
The Supplement related to this article is available online at https://doi.org/10.5194/acp-17-8247-2017-supplement.
Competing interests. The authors declare that they have no conflict of interest.