Distinguishing molecular characteristics of aerosol water soluble organic matter from the 2011 trans-North Atlantic US GEOTRACES cruise

The molecular characteristics of aerosol organic matter (OM) determine to a large extent its impacts on the atmospheric radiative budget and ecosystem function in terrestrial and aquatic environments, yet the OM molecular details of aerosols from different sources are not well established. Aerosol particulate samples with North American-influenced, North African-influenced, and marine (minimal recent continental influence) air mass back trajectories were collected as part of the 2011 trans-North Atlantic US GEOTRACES cruise and analyzed for their water soluble OM (WSOM) molecular characteristics using electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. Principal component analysis (PCA) separated the samples into five groups defined by distinct molecular formula characteristics. An abundance of nitrogen containing compounds with molecular formulas consistent with amino acid derivatives defined the two samples comprising the primary marine group (henceforth referred to as Primary Marine), which suggests a primary marine biological source to their WSOM in spite of their North American-influenced air mass trajectories. A second group of samples (aged marine, henceforth referred to as Aged Marine) with marine air mass trajectories was characterized by an abundance of low O / C (0.15–0.45) sulfur containing compounds consistent with organosulfate compounds formed via secondary aging reactions in the atmosphere. Several samples having North American-influenced air mass trajectories formed another group, again characterized by organosulfate and nitrooxyorganosulfate type compounds but with higher O / C ratios (0.5–1.0) than the Aged Marine samples, reflecting the combustion influence from the North American continent.
All the samples with North African-influenced air mass trajectories were grouped together in the PCA and were characterized by a lack of heteroatom (N, S, P) containing molecular formulas covering a wide O / C range (0.15–0.90) reflecting the desert source of this WSOM. The two marine groups showed molecular formulas that, on average, had higher H / C ratios and lower O / C ratios and modified aromaticity indices than the two continentally influenced groups, which suggests that these properties are characteristic of marine vs. continental aerosol WSOM. The final sample group, the mixed source samples (henceforth referred to as Mixed Source), showed intermediate molecular characteristics, which suggests no dominant continental or marine source. The source-specific OM details described here will aid efforts to link aerosol OM source with molecular characteristics and impacts in the environment.


Published by Copernicus Publications on behalf of the European Geosciences Union.
Aerosol organic matter (OM) is an important component within aerosols, making up to 90 % of the aerosol fine particle load in some areas (Kanakidou et al., 2005). In the atmosphere, aerosol OM can absorb (or scatter) light contributing a warming (or cooling) effect to the atmospheric radiative budget via the direct effect (e.g., Ramanathan et al., 2001; Hansen et al., 2005) or act as condensation nuclei contributing to the growth of clouds impacting climate via the indirect effect (e.g., Ramanathan et al., 2001). Much of this OM is ultimately transported and deposited to terrestrial and aquatic systems where it has been implicated as an important OM source to riverine export (Likens et al., 1984; Velinsky et al., 1986; Wozniak et al., 2011) and as a complexing agent facilitating the delivery of soluble, bioavailable Fe to the ocean (Paris and Desboeufs, 2013; Wozniak et al., 2013).
The chemical characteristics of aerosol OM play a major role in its environmental effects. For example, aromatic and double bond functional groups impart a light absorbing property to OM (Andreae and Gelencsér, 2006), water soluble OM components have hygroscopic properties that increase cloud condensation nuclei formation (Chan et al., 2005), and the presence of carboxylic functionalities has been suggested to allow combustion-derived aerosol OM to complex with Fe and facilitate its delivery to the ocean in bioavailable form. Given the acknowledged roles of aerosol OM in climate and ecosystem function and its disparate sources (e.g., natural terrestrial, anthropogenic biomass and fossil fuel combustion, primary marine biogenic, secondary formation), it is important to understand the defining physical and chemical characteristics of aerosol OM from the different sources so we can model and predict ecosystem and climate responses to changes in the relative and overall magnitudes of source-specific aerosol OM emissions. Yet, at present the source-specific chemical characteristics of aerosol OM are not well defined. This problem is particularly challenging for primary sea spray sources due to difficulties collecting aerosol samples in remote marine environments, analytical challenges related to the low OM atmospheric loadings of marine aerosols, and the fact that aerosols over ocean environments often contain OM from both continental and marine sources.
While global aerosol OM budgets are uncertain, models suggest that marine-derived aerosol OM, emitted via sea spray as primary aerosols or formed by secondary processes from volatile organic precursors (O'Dowd and de Leeuw, 2007; Gantt and Meskhidze, 2013), accounts for approximately 10 Tg (range: 2-100 Tg yr⁻¹; Gantt and Meskhidze, 2013 and references therein) of the estimated 150 Tg organic aerosols emitted each year globally (range: 60-240 Tg yr⁻¹; Hallquist et al., 2009). Access to the marine environment for the purposes of aerosol sampling can be difficult, and as a result, much less is known about the chemical composition of aerosol OM from marine as compared to continental sources. Much of what is known about marine aerosol OM chemical composition results from studies of aerosols collected at coastal or remote island sites (e.g., Gagosian et al., 1981; Facchini et al., 2008a, b; Claeys et al., 2010; Ovadnevaite et al., 2011a, b), shipboard during research cruises (e.g., Schmitt-Koplin et al., 2012), aboard research flights (Russell et al., 2010), or via bubble bursting experiments that produce aerosols artificially from seawater (e.g., Kuznetsova et al., 2005; Facchini et al., 2008b; Schmitt-Koplin et al., 2012). These studies have identified likely biological primary emissions of marine aerosol OM such as carbohydrate or polysaccharide-like and amino acid or protein-like compounds (Russell et al., 2010) as important constituents. Other studies have detected organosulfate compounds (Claeys et al., 2010) and small organic ammonium compounds (Facchini et al., 2008a) that are thought to be derived from marine secondary sources. Still other studies have identified various carboxylic acid compounds (Decesari et al., 2011; Schmitt-Koplin et al., 2012) and aliphatic amines (Decesari et al., 2011).
While these studies have increased our understanding of aerosol OM composition in ways that allow us to relate aerosol OM composition to environmental impact, much work still needs to be done to understand what governs the molecular composition of aerosol OM and its components (e.g., water insoluble OM (WIOM), water soluble OM (WSOM), particle size fractions) in various parts of the world's oceans, at different times of the year, subject to different influences (e.g., aerosol sources, oceanographic and meteorological conditions), and using all available analytical tools. In particular, studies that increase coverage of the ocean and that are able to distinguish between marine and continental sources are needed. Analyses of field-collected marine aerosol OM seeking to unambiguously identify marine-derived components are complicated by the presence of continentally transported aerosol OM over the marine environment, but multivariate statistical approaches offer a way to distinguish defining differences among data sets. For example, positive matrix factorization has been used to separate aerosols collected over the southeast Pacific Ocean into a marine component containing organic hydroxyl functional groups indicative of carbohydrate-like compounds and a saturated aliphatic carboxylic acid combustion component.
The WSOM fraction of aerosols is thought to enhance cloud condensation nuclei formation (Chan et al., 2005), has been investigated as a possible ligand for bioavailable and soluble Fe for biological production in high nutrient low chlorophyll regions, and is the focus of this study. WSOM is analyzed here by Fourier transform ion cyclotron resonance mass spectrometry (FTICR MS) coupled to electrospray ionization (ESI), an analytical tool that effectively characterizes natural OM and has begun to be used to study aerosol OM (e.g., Reemtsma et al., 2006; Wozniak et al., 2008; Lin et al., 2012) including marine aerosol OM (Schmitt-Koplin et al., 2012). This study examines the WSOM characteristics of 24 aerosol samples collected as part of the 2011 trans-North Atlantic US GEOTRACES cruise. Principal component analysis (PCA) is performed on data derived from the FTICR MS spectra obtained for the 24 samples to investigate the defining molecular characteristics of aerosols influenced to varying degrees by primary and secondary marine, continental combustion (North America), and continental dust (North Africa/Saharan dust) sources. In so doing, this study provides (1) increased spatial coverage of the molecular characteristics of aerosol WSOM over a vast portion of the North Atlantic Ocean for which aerosol WSOM has not been characterized and (2) important information on the source-specific molecular characteristics of aerosol WSOM produced in and transported to the marine environment.

Sample collection, handling, and storage
Aerosol total suspended particulate samples were collected aboard the R/V Knorr along a transatlantic transect between 7 November and 9 December 2011 (n = 24) (Fig. 1) as part of the 2011 US GEOTRACES program cruise (www.geotraces.org). Briefly, air was pulled through precombusted (480 °C, 10 h) quartz microfiber (QMA) filters (20.3 cm × 25.4 cm, 406 cm² exposed area, 0.6 µm effective pore size) deployed in a high-volume aerosol sampler (model 5170-BL, Tisch Environmental) operating at approximately 1.2 m³ air min⁻¹. These high-volume aerosol samplers are similar to those used in many studies of marine aerosols (Arimoto et al., 1995; Prospero, 1999; Baker et al., 2006). Material retained on the filters was operationally defined as "total suspended particulates". Sample face velocities were 50 cm s⁻¹ and durations averaged 19.4 h (range = 5.9-29.6 h) with the volume of air filtered averaging 1440 m³ (range = 400-2200 m³). Sample details can be found in the Supplement (Supplement Table S1). Filters were visually inspected during sampling to estimate particulate loading, and where low particulate loads were observed the samplers were allowed to run for longer durations. All filters were stored frozen prior to processing, both on the ship and upon returning to the laboratory.
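As a quick consistency check on the sampling figures quoted above, the face velocity and the sampled air volume can be recomputed from the flow rate, the exposed filter area, and the mean duration. The sketch below is illustrative only; the function names are hypothetical, and the flow rate is the approximate value given in the text.

```python
# Values quoted in the sampling description above.
FLOW_M3_PER_MIN = 1.2      # approximate sampler flow rate
EXPOSED_AREA_CM2 = 406.0   # exposed filter area
MEAN_DURATION_H = 19.4     # mean sampling duration


def face_velocity_cm_s(flow_m3_min, area_cm2):
    """Linear air speed at the filter face: volumetric flow / exposed area."""
    flow_cm3_s = flow_m3_min * 1.0e6 / 60.0  # m^3 min^-1 -> cm^3 s^-1
    return flow_cm3_s / area_cm2


def sampled_volume_m3(flow_m3_min, duration_h):
    """Air volume pulled through the filter over the sampling duration."""
    return flow_m3_min * duration_h * 60.0


velocity = face_velocity_cm_s(FLOW_M3_PER_MIN, EXPOSED_AREA_CM2)
volume = sampled_volume_m3(FLOW_M3_PER_MIN, MEAN_DURATION_H)
print(f"face velocity ~ {velocity:.1f} cm/s, volume ~ {volume:.0f} m^3")
```

The recomputed face velocity is close to the reported 50 cm s⁻¹. The volume from the mean duration (about 1400 m³) is slightly below the reported 1440 m³ average, which is unsurprising because the flow rate is approximate and the reported value averages per-sample volumes.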
[Figure caption, partial] Samples labeled "N Am" and "N Afr" showed a majority of 5 day ensemble air mass back trajectories extending back to the North American and African continents. Samples labeled "Mar" showed 5 day ensemble air mass back trajectories that had minimal continental influence.
Samplers were deployed on the ship's flying bridge (14 m a.s.l.) as high off the water as possible. Contamination from the ship's stack exhaust was avoided by controlling aerosol sampling with respect to wind sector and wind speed using an anemometer interfaced with a Campbell Scientific CR800 datalogger. The samplers were allowed to run when the wind was ±60 degrees from the bow and > 0.5 m s⁻¹. When the wind failed to meet these two criteria, the motors were shut off automatically and not allowed to restart until the wind met both the speed and direction criteria for 5 continuous minutes. The anemometer was deployed nearby on a separate pole in "free air" where turbulence from the wind crossing the bow did not cause the wind vane to wobble excessively.
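The wind-sector control described above amounts to a small state machine. The following is a minimal sketch of that logic, not the datalogger program actually used: the class and function names are hypothetical, the ±60 degree boundary is assumed to be inclusive, and the 5 min wait is assumed to apply to the first start as well as to restarts.

```python
from dataclasses import dataclass
from typing import Optional

SECTOR_HALF_WIDTH_DEG = 60.0   # wind must be within +/-60 degrees of the bow
MIN_SPEED_M_S = 0.5            # wind speed must exceed this value
RESTART_DELAY_S = 5 * 60       # 5 continuous minutes of acceptable wind


def wind_ok(rel_dir_deg, speed_m_s):
    """True if the wind (direction relative to the bow) meets both criteria."""
    folded = (rel_dir_deg + 180.0) % 360.0 - 180.0  # map to [-180, 180)
    return abs(folded) <= SECTOR_HALF_WIDTH_DEG and speed_m_s > MIN_SPEED_M_S


@dataclass
class SamplerController:
    running: bool = False
    good_since: Optional[float] = None  # start time of the current good-wind run

    def update(self, t_s, rel_dir_deg, speed_m_s):
        """Process one anemometer reading; return whether the motor is on."""
        if not wind_ok(rel_dir_deg, speed_m_s):
            # Any bad reading stops the motor and resets the waiting period.
            self.running = False
            self.good_since = None
            return self.running
        if self.good_since is None:
            self.good_since = t_s
        if not self.running and t_s - self.good_since >= RESTART_DELAY_S:
            self.running = True
        return self.running
```

Feeding the controller a time series of readings (time in seconds, direction in degrees relative to the bow, speed in m s⁻¹) reproduces the shut-off and delayed-restart behavior in the text.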

WSOM Isolation
The QMA filter samples were cut into subsamples using ceramic-bladed scissors, stored frozen in the dark to preserve the OM, and 1/4 of each sample was shipped on ice to Old Dominion University (ODU) and stored frozen prior to analysis. WSOM was isolated following published procedures for desorbing aerosol WSOM from total suspended particulate material on filters (e.g., Siefert et al., 1994; Wozniak et al., 2008, 2012). Aerosol subsamples (∼ 29 cm²) were placed in pre-combusted glass jars, and 25 mL of ultrahigh purity (MilliQ) water was added. The jars were wrapped in aluminum foil to keep out ambient light and placed on an orbital shaker table (VWR model 57018-754) for 2 h, which is sufficient time to desorb > 90 % of WSOM (Wozniak et al., 2012). The water was then filtered through a pre-combusted GF/F filter (0.7 µm pore size) using a pre-cleaned (10 % HCl) syringe to isolate the WSOM. Ten mL of WSOM was saved for FTICR analyses, and the remaining 15 mL was frozen and saved for analyses not discussed here.

Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR MS)
Aerosol WSOM (10 mL) was solid-phase extracted using styrene divinyl benzene polymer (PPL, Varian) cartridges following published procedures (e.g., Dittmar et al., 2008; Mitra et al., 2013). The PPL cartridge was rinsed with two cartridge volumes of LC-MS grade methanol and acidified Milli-Q water before the pre-filtered and acidified aerosol WSOM was loaded onto the cartridge. The relatively hydrophobic dissolved OM (DOM) is retained on the PPL cartridge while highly hydrophilic WSOM and salts (which would otherwise compete with the aerosol WSOM for charge during ESI) pass through the cartridge. The cartridge was then rinsed with 0.01 M HCl to ensure complete removal of salts, dried under a stream of ultrapure nitrogen, and eluted with one cartridge volume (∼ 3 mL) of methanol. Previous work has estimated that ∼ 60 % of dissolved OM is retained using this technique (Dittmar et al., 2008). Though previous studies have added ammonium hydroxide prior to ESI in order to increase ionization, tests showed that higher quality spectra (more organic matter peaks) were obtained without the addition of base, and samples were infused to the ESI in methanol alone. Samples were continuously infused into an Apollo II ESI ion source of a Bruker Daltonics 12 T Apex Qe FTICR, housed at ODU's COSMIC facility. Samples were introduced by a syringe pump at 120 µL h⁻¹. All samples were analyzed in negative ion mode; ions were accumulated in a hexapole for 0.5 s before being transferred to the ICR cell, where 300 transients were co-added. The summed free induction decay signal was zero-filled once and sinebell apodized prior to fast Fourier transformation and magnitude calculation using the Bruker Daltonics Data Analysis software. All mass spectra were externally calibrated with a polyethylene glycol standard and internally calibrated with naturally occurring fatty acids and other homologous series present within the sample. A molecular formula calculator (Molecular Formula Calc v. 1.0 NHMFL ©, 1998) generated molecular formulas using carbon ($^{12}\mathrm{C}_{8-50}$), hydrogen ($^{1}\mathrm{H}_{8-100}$), oxygen ($^{16}\mathrm{O}_{1-30}$), nitrogen ($^{14}\mathrm{N}_{0-5}$), sulfur ($^{32}\mathrm{S}_{0-2}$), and phosphorus ($^{31}\mathrm{P}_{0-2}$). Peaks identified in process blanks (PPL extract of QMA filter blank WSOM) were subtracted from the sample peak list prior to formula assignment. Only m/z values in the range of 200-800 with a signal to noise ratio above 3 were used for molecular formula assignments. The mean mass resolution for all samples over that mass range was 560 000.
Constraints corresponding to the standard range of atomic composition for natural OM were applied during formula assignment following previous work (Stubbins et al., 2010; Wozniak et al., 2008) (3) N / C ≤ 0.5; (4) S / C ≤ 0.2; (5) P / C ≤ 0.2; (6) DBE ≥ 0 and an integer value. The term DBE represents the number of double bond equivalents, which is the number of double bonds and rings in a formula (e.g., Hockaday et al., 2006). The measured m/z values and the exact masses calculated for the assigned formulas all agreed within the maximum allowed error of 1.0 ppm, and > 90 % of formulas were within 0.5 ppm. An unequivocal formula is found for m/z values below 450, but above this, multiple formulas may match the measured m/z value. In order to ensure a unique formula per peak, a formula extension approach similar to that described by Kujawinski et al. (2009) was used to identify appropriate formulas. Additional constraints were placed on the proportion of heteroatoms using the following criteria (Kujawinski et al., 2009): (1) each formula should have numbers of N and S atoms that are each fewer than the number of oxygen atoms, and (2) the sum of the N and S atoms should be the lowest possible.
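The screening rules listed above can be expressed compactly in code. The sketch below is an illustration under stated assumptions rather than the formula calculator used in the study: it implements only the constraints that appear in the text (N / C, S / C, P / C, DBE, the 1.0 ppm error limit, and the two heteroatom criteria), uses the common DBE expression for CHNOSP formulas with trivalent N and P, and evaluates DBE on the neutral formula. All function names are hypothetical.

```python
def dbe(c, h, n=0, p=0):
    """Double bond equivalents (rings + double bonds) of a neutral formula.

    Assumes trivalent N and P; divalent O and S do not contribute.
    """
    return (2 * c + 2 - h + n + p) / 2


def ppm_error(measured_mz, exact_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - exact_mz) / exact_mz * 1.0e6


def passes_constraints(formula, measured_mz, exact_mz, max_ppm=1.0):
    """Apply the elemental-ratio, DBE, and mass-error limits from the text."""
    c = formula["C"]
    n, s, p = (formula.get(el, 0) for el in ("N", "S", "P"))
    d = dbe(c, formula["H"], n, p)
    return (
        n / c <= 0.5
        and s / c <= 0.2
        and p / c <= 0.2
        and d >= 0
        and d == int(d)
        and abs(ppm_error(measured_mz, exact_mz)) <= max_ppm
    )


def pick_formula(candidates):
    """Heteroatom criteria: N < O and S < O, then the lowest N + S."""
    allowed = [
        f for f in candidates
        if f.get("N", 0) < f.get("O", 0) and f.get("S", 0) < f.get("O", 0)
    ]
    if not allowed:
        return None
    return min(allowed, key=lambda f: f.get("N", 0) + f.get("S", 0))
```

For example, a saturated C16 fatty acid (C16H32O2) has one DBE, and a measured m/z that differs from the calculated value by 0.0001 at m/z 255 corresponds to an error of roughly 0.4 ppm, inside the 1.0 ppm limit. The m/z values in such a check are illustrative and are not taken from the study's data.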
The spectral magnitude for a given peak results from a combination of the concentrations of the isomeric compounds representing a given molecular formula in the actual WSOM, how ionizable those compounds are using ESI, and the FTICR MS analytical window. ESI FTICR MS is thus not a purely quantitative technique. Nonetheless, for comparisons of samples measured using the same instrumental conditions, the relative spectral magnitude contributed by elemental formulas and formula groups (magnitude-weighted contributions) has been used as an instructive qualitative measure (e.g., Wozniak et al., 2008; Sleighter et al., 2009; Mazzoleni et al., 2010; O'Brien et al., 2012; Schmitt-Koplin et al., 2012). Magnitude-weighted elemental formula group fractional contributions (e.g., CHO, CHON, CHOS, etc.) were calculated as follows:
$$X_w = \sum M_X$$
where $X_w$ represents the magnitude-weighted elemental formula fractional contribution, and $M_X$ denotes that the relative magnitudes are summed over the elemental formula combination (e.g., CHO, CHON, CHOS, CHONS, CHONP, CHOSP) of interest. Magnitude-weighted O / C, H / C, and modified aromaticity index ($\mathrm{AI_{mod}}$) values for a given sample were calculated as follows:
$$Y_w = \sum_n \left( Y \times M \right)$$
where $Y_w$ represents the magnitude-weighted parameter of interest (O / C, H / C, $\mathrm{AI_{mod}}$), $n$ denotes that the summation includes all assigned formulas in a given sample, and $M$ represents the relative spectral magnitude for the assigned elemental formula. The O / C and H / C ratios are calculated by dividing the number of O (or H) atoms by the number of C atoms in a given assigned formula. $\mathrm{AI_{mod}}$ is a measure of the probable aromaticity for a given OM molecular formula assuming that half of the oxygen atoms are doubly bound and half are present as σ bonds (Koch and Dittmar, 2006):
$$\mathrm{AI_{mod}} = \frac{1 + C - 0.5\,O - S - 0.5\,H}{C - 0.5\,O - S - N - P}$$
where C, O, S, H, N, and P represent the number of carbon, oxygen, sulfur, hydrogen, nitrogen, and phosphorus atoms in a given molecular formula.
The $\mathrm{AI_{mod}}$ scale ranges from 0 for a purely aliphatic compound to 1, with increasing values found for compounds showing higher numbers of double bonds and relative aromaticity.
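The magnitude-weighted quantities defined above can be sketched in a few lines. This is an illustrative implementation under assumptions, not the authors' code: peaks are represented as hypothetical (formula, magnitude) pairs, magnitudes are normalized explicitly so that raw or relative values can be supplied, and AI mod is set to 0 when its numerator or denominator is not positive (a common convention, assumed here).

```python
def ai_mod(c, h, o=0, n=0, s=0, p=0):
    """Modified aromaticity index in the form given by Koch and Dittmar (2006)."""
    numerator = 1 + c - 0.5 * o - s - 0.5 * h
    denominator = c - 0.5 * o - s - n - p
    if numerator <= 0 or denominator <= 0:
        return 0.0  # assumed convention for non-aromatic or undefined cases
    return numerator / denominator


def formula_group(formula):
    """Elemental formula group label, e.g. 'CHO', 'CHON', 'CHOS', 'CHONSP'."""
    return "CHO" + "".join(el for el in "NSP" if formula.get(el, 0) > 0)


def weighted_group_fractions(peaks):
    """Fraction of total spectral magnitude contributed by each formula group."""
    total = sum(magnitude for _, magnitude in peaks)
    fractions = {}
    for formula, magnitude in peaks:
        group = formula_group(formula)
        fractions[group] = fractions.get(group, 0.0) + magnitude / total
    return fractions


def weighted_mean(peaks, parameter):
    """Magnitude-weighted mean of a per-formula parameter (O/C, H/C, AI mod)."""
    total = sum(magnitude for _, magnitude in peaks)
    return sum(parameter(f) * m for f, m in peaks) / total


def o_to_c(formula):
    return formula.get("O", 0) / formula["C"]


def h_to_c(formula):
    return formula["H"] / formula["C"]
```

With two hypothetical peaks, C10H20O5 at magnitude 3 and C10H10O2S at magnitude 1, the CHO group carries three quarters of the magnitude and the weighted O / C is (0.5 × 3 + 0.2 × 1) / 4.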

Principal Component Analysis (PCA)
The PCA was performed on FTICR MS data sets using an in-house developed Matlab script. Before being utilized for PCA, the 500 most intense peaks that were assigned molecular formulas were identified for each sample. This amounted to a total of 3077 molecular formulas that represented the input variables for the PCA. The relative intensity (the intensity of a peak divided by the summed intensities for all peaks assigned formulas, a score of 0 was given to molecular formulas not present in a sample) of each of these peaks within all 24 samples was determined and used to generate the matrix for the PCA. The PCA was performed for this initial list of 3077 molecular formulas and with the 2408 molecular formulas that were present in at least 3 samples (> 12 % of all samples). The OM compounds represented by this molecular formula list (n = 2408) are responsible for the majority (52-92 %) of the spectral magnitude in the samples indicating that the compounds are high in concentration and/or are efficiently ionized and detected by negative ion ESI FTICR MS. Using this reduced molecular formula list concentrated the analyses on the most abundant peaks and removed the influence of molecular formulas present in only a few samples. PCA uses an orthogonal transformation to reduce multiple variables into fewer dimensions called principal components (PCs) that are generated by the PCA and defined by PC loadings ascribed to each variable (in this case, the 2408 or 3077 assigned molecular formulas). Each sample receives a score for each PC based on the magnitude of its input variables and the corresponding loadings for those variables to each PC. Samples with high PC1 scores (for example) can thus be expected to have high contributions from variables showing high loadings for PC1. 
The first PC explains the largest portion of the variance in the data set, and each successive PC is orthogonal to the previous PC and explains the largest portion of the remaining variance in the data set. PCs can then be plotted on 2-D and 3-D plots created from sample scores to evaluate the differences between samples.
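The matrix construction and the idea of a leading principal component can be illustrated with the standard library alone. The sketch below is not the in-house Matlab script: it builds the relative-intensity matrix as described (top peaks per sample, zeros for absent formulas, formulas kept only if present in a minimum number of samples) and then estimates only the first PC by power iteration on the mean-centered data, a simple substitute for a full PCA. Function names and the demonstration data are hypothetical.

```python
def relative_intensities(sample):
    """Intensity of each assigned peak divided by the summed intensities."""
    total = sum(sample.values())
    return {formula: value / total for formula, value in sample.items()}


def build_pca_matrix(samples, top_k=500, min_samples=3):
    """Rows are samples, columns are formulas kept as PCA input variables."""
    candidates = set()
    for sample in samples:
        ranked = sorted(sample, key=sample.get, reverse=True)
        candidates.update(ranked[:top_k])  # most intense peaks per sample
    kept = sorted(
        f for f in candidates
        if sum(1 for s in samples if f in s) >= min_samples
    )
    matrix = [
        [rel.get(f, 0.0) for f in kept]  # 0 for formulas absent from a sample
        for rel in map(relative_intensities, samples)
    ]
    return kept, matrix


def first_pc(matrix, iterations=200):
    """Loadings and sample scores for PC1 via power iteration."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    loadings = [1.0 / (j + 1) for j in range(m)]  # arbitrary starting vector
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, loadings)) for row in centered]
        updated = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(x * x for x in updated) ** 0.5
        if norm == 0.0:
            break
        loadings = [x / norm for x in updated]
    scores = [sum(a * b for a, b in zip(row, loadings)) for row in centered]
    return loadings, scores


# Hypothetical toy data: two CHO-dominated and two CHOS-dominated samples.
demo_samples = [
    {"C10H20O5": 8.0, "C9H18O4": 2.0},
    {"C10H20O5": 7.0, "C9H18O4": 3.0},
    {"C8H16O4S": 9.0, "C9H18O4": 1.0},
    {"C8H16O4S": 8.0, "C9H18O4": 2.0, "C7H14O3": 0.1},
]
kept, matrix = build_pca_matrix(demo_samples, top_k=2, min_samples=2)
loadings, scores = first_pc(matrix)
```

In the toy data the first two samples receive PC1 scores of one sign and the last two scores of the opposite sign, mirroring how samples with high contributions from formulas with high loadings separate along a PC. The sign of a PC is arbitrary.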

Air mass back trajectory simulations
Air mass back trajectory simulations were performed using the publicly available NOAA ARL HYSPLIT model (Rolph, 2013; Draxler and Rolph, 2013). Back trajectories (120 h) were calculated from the ship's position and collection time using the ensemble form of the model. The ensemble mode reveals the effects from small-scale meteorological variability on the trajectories (Draxler, 2003). In order to investigate the behavior and fate of aerosols in the marine boundary layer (MBL, typically 400-1200 m thickness), an arrival height of 250 m was chosen. This height is also the minimum height for optimal configuration of the ensemble. Given the uncertainty of the air mass behaviors within the MBL, this height was chosen to best characterize air mass trajectories at our sample collection height. Simulations were also conducted using higher (1500 m) arrival heights, which returned similar results (data not shown). Air masses were characterized operationally as North American or North African if more than half of the air mass trajectory simulations extended to these regions, as is shown in Fig. 2. If less than half of the 5 day air mass trajectory simulations extended to continents, the sample was classified as "marine". It is noted that air masses categorized as "marine" may very well contain aerosol OM from distant (> 5 days) continental sources. Additionally, because all the samples were collected over the marine environment, they are all expected to be influenced by marine sources to some degree. Where possible, visual inspection of the QMA filters was used to corroborate air mass trajectory simulations for determination of aerosol source influence. Specifically, filters with an orange color to the particulate matter were determined to have a North African dust influence and were assigned as North African if air mass trajectories were ambiguous.
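The operational classification rule above can be written as a short decision function. The sketch below is an interpretation, not the authors' procedure: it assumes each ensemble member has already been labeled by the continent its trajectory reaches ("N Am", "N Afr", or None), and it treats any case not covered by the two majority rules as ambiguous, where filter color acts as the tie-breaker. The function name and labels are hypothetical.

```python
def classify_air_mass(member_regions, orange_filter=False):
    """Classify a sample from per-member trajectory labels.

    member_regions: one entry per ensemble member, 'N Am', 'N Afr', or None
    (None meaning the 5 day trajectory did not reach a continent).
    """
    n_members = len(member_regions)
    half = n_members / 2.0
    for region in ("N Am", "N Afr"):
        if member_regions.count(region) > half:
            return region  # more than half of members reach this region
    continental = sum(1 for r in member_regions if r is not None)
    if continental < half:
        return "Mar"  # fewer than half of members reach any continent
    # Remaining cases are treated as ambiguous; filter color decides if available.
    return "N Afr" if orange_filter else "ambiguous"
```

For instance, a sample with 15 of 27 members reaching North America would be labeled "N Am", while one with only 7 of 27 members reaching any continent would be labeled "Mar". The member counts here are illustrative.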

Results and discussion
Air mass trajectory analyses showed the data set to be comprised of nine marine-influenced (Mar4-12), seven North African-influenced (N Afr9-15), and eight North American-influenced (N Am1-8) aerosol samples (Fig. 1). These classifications are operationally defined and used as just one guideline to help delineate the provenance of aerosol WSOM in each sample. Visual inspection of the filters showed each of the North African-influenced samples to have a reddish-orange color demonstrating an influence from desert dust, as has been found in previous studies (e.g., Baker et al., 2006), while the remaining samples all showed very little color.
As has been observed in previous studies of negative ion ESI FTICR MS data, using magnitude-weighted abundances increases the importance of CHO and CHOS formulas relative to formulas containing N or P (Wozniak et al., 2008; Sleighter et al., 2009). CHO formulas accounted for the largest portion of the spectral magnitude in 19 of 24 samples, CHOS formulas were most abundant in 3 marine samples (Mar6, Mar7, Mar12), and CHON formulas were most abundant in N Am 6 and N Am 7. The CHOS (12 of 24), CHON (6 of 24), CHO (5 of 24), and CHOP (1 sample) formula groups were all second in magnitude-weighted abundance for various samples (Supplement Table S2).
Of the 11813 molecular formulas, 614 and 248 formulas were present in at least 18 (75 %) and 22 (92 %) of the 24 samples, respectively (Fig. 3). The majority of these formulas were CHO formulas with H / C ratios between 1.05 and 1.75 and O / C ratios between 0.2 and 0.7 or CHOS formulas with H / C ratios between 1.45 and 1.95 and O / C ratios between 0.25 and 0.7. All of these formulas were also found in at least 1 of 18 aerosol WSOM samples isolated from aerosol particulates collected in Virginia and New York, USA (Wozniak et al., 2008; Wozniak, unpublished data), which suggests that these CHO and CHOS compounds are ubiquitous in continental and oceanic atmospheres and not particularly characteristic for WSOM from marine sources. CHO and CHOS formulas with these H / C and O / C ratios are ubiquitous in natural DOM of many types including continental rainwater, fog water, riverine, estuarine, and deep sea DOM (e.g., Sleighter et al., 2008; Mazzoleni et al., 2010; Mitra et al., 2013; Rossel et al., 2013), making assignment of the source of any of these formulas in a given natural OM sample difficult. Their ubiquity amongst marine and terrestrial samples and similar H / C and O / C ratios to those for laboratory-generated SOA (Kourtchev et al., 2014) suggest that they may be SOA products of monoterpene volatile organic compounds.

Principal Component Analysis
An initial PCA (data not shown) performed using the relative spectral magnitude of the 500 most intense formulas found in every sample showed two samples (N Am 6 and N Am 7) with molecular characteristics that are extremely different from those of the other 22 samples. These two samples were collected very close to Bermuda (< 150 km), the island nation near the western edge of the Sargasso Sea 1000 km from the North American mainland, 1 week after tropical storm Sean passed through the region and had air mass back trajectories characterized as North American-influenced. The unique molecular composition of these two samples may thus be due to the effects of tropical storm Sean creating a turbulent wave environment that emitted marine-derived sea spray aerosol OM into the atmosphere or to a strong North American source of OM. Examination of the molecular formulas in these two samples found 162 molecular formulas that were found in no other sample. The majority (144) of these unique molecular formulas contain N (1-5 nitrogens) and have high H / C (1.6-2.0) and low O / C (0.2-0.4) ratios, characteristics of peptide-like compounds. These samples are discussed in more detail in the Primary Marine sample group section.
To remove the impact of infrequently occurring molecular formulas that drive the PCA scores (following Sleighter et al., 2010), molecular formulas occurring in fewer than 3 samples (< 12 % of the 24 samples) were removed from the PCA loading matrix, reducing the number of loading variables from 3077 to 2408. The relative distribution of elements in these formulas is quite different from that in the complete list of 11813 assigned formulas, better reflecting the relative spectral magnitude and the approximate distribution of the elemental formula groups within a given sample. These 2408 formulas are made up of 1172 (49 %) CHO, 592 (25 %) CHOS, 364 (15 %) CHON, 112 (4.7 %) CHOP, 90 (3.7 %) CHONS, 70 (2.9 %) CHONP, and 8 (0.3 %) CHOSP formulas (Fig. 4). The first 3 PCs generated by this PCA account for 51 % of the variance in the data set and separate the 24 samples into 5 distinct groups (Fig. 5a). The sample groups are categorized based on their distinct characteristics as defined by the PCA loadings (the molecular formulas; Fig. 5b) that correspond to a given group's PCA scores and the predominant air mass trajectories for the members of each group. They are (with the identities of samples making up each group listed in parentheses) as follows: (1) Primary Marine (N Am 6, 7); (2) Aged Marine (Mar 5, 6, 7, 9, 12); (3) North American-influenced (N Am 1, 2, 3, 4, 5); (4) North African-influenced (N Afr 9, 10, 11, 12, 13, 14, 15); (5) Mixed Source (Mar 4, 8, 10, 11, N Am 8).

Primary Marine OM samples
Even with the removal of infrequently occurring molecular formulas, the PCA separates the two outlier samples, N Am 6 and N Am 7, into their own group, highlighting their unique character relative to the other samples in this data set. While the air mass trajectories for N Am 6 and N Am 7 extend back to the North American continent, the WSOM molecular characteristics, specifically the abundance of N and P containing molecular formulas, for these two samples suggest that their WSOM has a primary marine biological source. Analysis of the magnitude-weighted average values for all of the PCA assigned sample groups shows the Primary Marine group to have higher mean magnitude-weighted percent abundances of CHON (54 %) and CHONP (5.4 %) formula groups than any of the other PCA groups (Table 1). The CHO and CHOS formula groups, noted above as ubiquitous components in all aerosol WSOM, still comprise quantitatively important portions of the spectral intensity (30 and 7 %, respectively) but smaller portions than in the other PCA differentiated groups. Marine phytoplankton and bacteria have higher nutritional requirements for both N and P than does terrestrial vegetation (e.g., Fagerbakke et al., 1996; Hewson and Fuhrmann, 2008; Bianchi and Canuel, 2011), and both dissolved organic nitrogen and phosphorus have long been known to be enriched in the sea-surface microlayer (e.g., Williams, 1967; Williams et al., 1986) where surface active OM components accumulate after being entrained and brought to the surface with bubbles. The N- and P-enriched surface microlayer OM can then be emitted as sea spray, and marine biological components therefore represent a probable source of aerosol WSOM for this set of samples. Zamora et al. (2013), analyzing rainwater samples collected in Barbados, found higher dissolved organic P and N concentrations in sea-spray derived rainwater relative to rainwater influenced by dust or biomass burning.
The higher relative contributions of CHON and CHONP containing aerosol WSOM for this group of samples are thus consistent with a sea spray source of primary WSOM.
Patterns observed in the PCA loadings for the molecular formulas defining this sample group provide further evidence for a primary marine source of aerosol WSOM for these 2 samples and important molecular information for potential sea spray WSOM constituents. The PCA loadings defining this group are plotted together in a van Krevelen diagram (Fig. 6a) with the 162 molecular formulas found in these samples but no others (and not included in the PCA). CHON formulas make up 85 % of the formulas in Fig. 6a, and 38 of the remaining 59 formulas are CHONP and CHONS formulas. The CHON formulas found to be defining for this Primary Marine sample group have O / C (0.15 < O / C < 0.45) and H / C (1.5 < H / C < 2.0) ratios that suggest potential peptide-like compounds, and several of the molecular formulas can be attributed to compounds with functionalized amino acids. These CHON molecular formulas form several Kendrick mass defect series with 1 to 4 nitrogen atoms in their molecular formulas (series of formulas differing by CH2 groups; Kendrick, 1963; Fig. 7) that could represent biological compounds that have been degraded to different degrees in the water or during atmospheric transport. Figure 7 includes examples of potential amino acid containing compounds that correspond to these formulas. It is noted that the structures in Fig. 7 are tentative and represent only some of many potential isomers that correspond to the assigned molecular formulas. LC/MS or a comparable technique is needed to verify the structures of the compounds that correspond to these formulas, but this is beyond the scope of this particular paper. Further, the CHONP formulas defining this sample group have elemental ratios (H / C > 2.0, O / P > 4) consistent with phospholipid-like compounds, important structural components in biological cell membranes.
The apparent importance of the peptide-like CHON series and the phospholipid-like CHONP formulas is an indication of a primary biological source for these compounds and for the WSOM in general.
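The grouping of formulas into Kendrick mass defect series can be illustrated with a short sketch. It is not the processing code used in this study: the function names and the example formulas are hypothetical, and the sign convention (nominal Kendrick mass minus Kendrick mass) is one of several in use. Formulas that differ only by CH2 groups share the same Kendrick mass defect.

```python
# Minimal sketch (not the study's processing code): Kendrick mass defect on the CH2 scale.
# Monoisotopic masses (u) of the most abundant isotopes.
MONO_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720710,
    "P": 30.9737616,
}
CH2_EXACT = MONO_MASS["C"] + 2 * MONO_MASS["H"]  # about 14.01565 u


def exact_mass(formula):
    """Monoisotopic mass of a neutral formula given as {element: count}."""
    return sum(MONO_MASS[el] * n for el, n in formula.items())


def kendrick_mass_defect(formula):
    """KMD on the CH2 scale; assumed convention: nominal Kendrick mass minus Kendrick mass."""
    kendrick_mass = exact_mass(formula) * 14.0 / CH2_EXACT
    return round(kendrick_mass) - kendrick_mass


def group_by_kmd(formulas, decimals=4):
    """Group formulas whose KMD agrees to `decimals` places (candidate CH2 series)."""
    series = {}
    for f in formulas:
        series.setdefault(round(kendrick_mass_defect(f), decimals), []).append(f)
    return series


# Hypothetical CHON formulas: the first two differ by one CH2 group, the third by one O.
f1 = {"C": 10, "H": 19, "N": 1, "O": 3}
f2 = {"C": 11, "H": 21, "N": 1, "O": 3}
f3 = {"C": 10, "H": 19, "N": 1, "O": 4}
series = group_by_kmd([f1, f2, f3])
```

In this example the two formulas differing by CH2 fall into one series and the third formula into another. Formulas from unrelated series can occasionally share a rounded mass defect, so a real workflow would also compare heteroatom counts.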
Previous studies of marine-derived atmospheric OM have noted contributions from biological components such as amino acid and protein-like compounds (Altieri et al., 2012) and carbohydrates and polysaccharide compounds (Russell et al., 2010). Though we are unaware of studies that have identified phospholipids in aerosol OM from a sea spray source, they are consistent with a strong biological component of sea spray aerosol OM. It should be noted that carbohydrates and polysaccharide-like compounds previously found to be important components in sea spray aerosol OM (Russell et al., 2010) do not ionize efficiently via ESI due to a lack of ionizable polar functional groups and may have thus escaped the analytical window of ESI FTICR MS. The apparent biological contributions to this Primary Marine PCA group lend support for the role of sea surface biological composition as a determining factor in sea spray aerosol OM composition (e.g., Aller et al., 2005; Schmitt-Koplin et al., 2012; Rinaldi et al., 2013; Gantt et al., 2013) and suggest the need to further explore how oceanographic conditions influence marine aerosol OM composition and impacts in the atmosphere.

Aged Marine samples
A second PCA group comprised of 5 samples with marine air mass back trajectories and distinguished by high contributions from CHOS containing compounds was also identified and is hereafter referred to as the "Aged Marine" samples based on their being characterized by an abundance of low O / C sulfur containing compounds (Fig. 6b) with O / S ratios greater than or equal to 4, which suggests that many of these formulas represent aliphatic organosulfate compounds. Organosulfate compounds have been identified in several studies examining rainwater and aerosol OM collected in terrestrial (e.g., Romero and Oehme, 2005; Reemtsma et al., 2006; Surratt et al., 2008; Wozniak et al., 2008; Altieri et al., 2009; Schmitt-Koplin et al., 2010) and marine (Claeys et al., 2010) environments. Laboratory studies show organosulfate formation under both light and dark conditions (e.g., Reemtsma et al., 2006; Surratt et al., 2007, 2008; Iinuma et al., 2007; Schmitt-Koplin et al., 2010) with the acid-catalyzed ring opening reaction of epoxides formed from biogenic and anthropogenic precursors described by Iinuma et al. (2007) being the most kinetically favorable formation mechanism studied to date. Thus, the abundance of organosulfate formulas in these samples can be attributed to secondary aging processes in the atmosphere and is indicative of aerosol WSOM that has been aged considerably relative to that in the Primary Marine samples. Claeys et al. (2010) identified C6-C13 organosulfates in aerosol samples collected on Amsterdam Island in the southern Indian Ocean with higher O / C ratios (0.46-1) than those identified as important to the Aged Marine samples here (0.25-0.5). The different O / C ratios likely relate to differences in the analytical windows of the techniques used in the two studies. Organosulfates in the Aged Marine group PCA loadings had negative ion nominal masses between 209 and 617 and contained 7-29 C atoms, while Claeys et al. (2010) detected negative ion masses up to 309 (≤ 13 C atoms). The longer carbon chains detectable here facilitate the detection of lower O / C ratio organosulfates, and they appear here as a defining characteristic for these Aged Marine samples. Sulfate is ubiquitous in the troposphere with strong marine and combustion sources, and the presence of the sulfate itself is not indicative of a particular source; however, the marine air mass trajectories and H / C and O / C ratios observed in these samples relative to those in samples with continentally influenced air mass trajectories are strong evidence for a marine aerosol WSOM source.
A. S. Wozniak et al.: Distinguishing molecular characteristics of aerosol water soluble organic matter, Atmos. Chem. Phys., 14, 8419-8434
Magnitude-weighted average H / C and O / C ratios for the Aged Marine (H / C = 1.57, O / C = 0.36, Table 1) group were slightly lower and higher, respectively, compared to those found for the Primary Marine (H / C = 1.66, O / C = 0.32, Table 1) group, consistent with the Aged Marine samples having been processed in the atmosphere. Several studies have shown aerosol OM to increase in O content and decrease in H / C with photochemical degradation (e.g., Heald et al., 2010; Kroll et al., 2011; Sato et al., 2012). The Primary Marine and Aged Marine WSOM groups are distinguished from the continentally influenced groups by these relatively high H / C and low O / C magnitude-weighted average values, indicative of a more aliphatic, less oxidized marine WSOM. A recent study of rainwater DOM similarly showed marine-derived dissolved organic nitrogen compounds to have lower O / C ratios than continentally influenced dissolved organic nitrogen compounds (Altieri et al., 2012). Schmitt-Koplin et al. (2012) demonstrated that bubbles that burst from sea water are enriched in DOM compounds with higher H / C and lower O / C ratios relative to those found in the surface water DOM. Aliphatic compounds with intermediate oxygen content appear to be the surface-active component within marine DOM that is emitted with sea spray aerosol, and their enrichment relative to the highly polar, less aliphatic WSOM signature typically found in continental and continentally influenced aerosol WSOM is a clear indicator of marine-derived aerosol WSOM. Our data further separate marine-derived aerosol WSOM into an N-rich peptide-like component that we propose represents a primary marine WSOM and a slightly more oxidized, S-rich organosulfur component that we propose represents aged marine WSOM.
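As an illustration of how magnitude-weighted averages such as those in Table 1 can be obtained, the sketch below weights each formula's elemental ratio by its peak magnitude. It assumes the usual intensity-weighted mean, sum(I_i x ratio_i) / sum(I_i); the function name and the two example peaks are hypothetical and are not data from this study.

```python
# Minimal sketch, assuming the magnitude-weighted average is the
# intensity-weighted mean of per-formula elemental ratios.
def magnitude_weighted_ratio(peaks, numerator, denominator="C"):
    """peaks: list of dicts with element counts and a peak 'intensity'."""
    total_intensity = sum(p["intensity"] for p in peaks)
    weighted_sum = sum(
        p["intensity"] * p[numerator] / p[denominator] for p in peaks
    )
    return weighted_sum / total_intensity


# Hypothetical peak list: an aliphatic formula at high magnitude and a more
# oxidized, less saturated formula at low magnitude.
peaks = [
    {"C": 10, "H": 20, "O": 2, "intensity": 3.0},
    {"C": 10, "H": 10, "O": 5, "intensity": 1.0},
]
hc = magnitude_weighted_ratio(peaks, "H")  # (3 * 2.0 + 1 * 1.0) / 4
oc = magnitude_weighted_ratio(peaks, "O")  # (3 * 0.2 + 1 * 0.5) / 4
```

Because the high-magnitude peak dominates, the weighted H / C lies closer to that formula's ratio than a simple unweighted mean would.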

North American-influenced samples
The North American-influenced sample group was dominated by CHOS PCA loadings with higher O / C ratios than those for the Aged Marine group. The PCA loadings for this group show 134 and 36 of the 269 formulas to be CHOS and CHONS containing formulas, respectively. These S containing formulas have a similar H / C range but a higher O / C range (1.3 < H / C < 2.0; 0.5 < O / C < 0.9) than those for the Aged Marine group (Fig. 6b and c). The majority (87 of 99) of the remaining formulas defining this group were CHO formulas plotting at H / C ratios between 1.0 and 1.6 and O / C ratios between 0.4 and 0.75 (Fig. 6c). The mean magnitude-weighted O / C ratio for these samples is 0.50, the highest of any of the sample groups, while the mean magnitude-weighted H / C ratio is 1.46, lower than the two marine groups but higher than the North African-influenced sample group. As in the two marine groups, heteroatomic molecular formulas are important components of these 6 samples, with CHOS and CHON formulas accounting for 26 and 14 % on average of the spectral intensity, respectively (Table 1).
Their high O / C ratios relative to the other sample groups suggest that these North American-influenced samples are characterized by a high degree of oxidation similar to the humic-like substances (HULIS) described for many North American and European continental samples (e.g., Salma et al., 2010; Pavlovic and Hopke, 2012; Paglione et al., 2014) and reflect high contributions from sulfate, nitrate, and carboxylic acid functional groups. Highly oxygenated atmospheric HULIS has been attributed to SOA from various biogenic volatile precursors (e.g., Shapiro et al., 2009; Stone et al., 2009), fossil fuel and biomass combustion byproducts (e.g., Decesari et al., 2002; Stone et al., 2009; Salma et al., 2010; Pavlovic and Hopke, 2012), and the subsequent photochemical aging of these materials (e.g., Decesari et al., 2002; Pavlovic and Hopke, 2012). The North American-influenced samples (N Am 1-5) included in this group were those closest to the continent (Fig. 1) where SOA formation and fossil fuel/biomass combustion emissions are highest, and we conclude that the importance of these high O / C CHO and CHOS compounds resulted from strong contributions from the North American continent to the WSOM in this group of samples.
The O / S ratios for the CHOS and CHONS formulas identified as important to these North American-influenced samples are consistent with those of organosulfate and nitrooxy organosulfate compounds (O / S > 4) formed as discussed in the Aged Marine section. The major difference between the organosulfates in the North American-influenced and Aged Marine samples is the higher O / C ratio of the former, which likely relates to the organosulfate precursor compounds for these two sources.
The O / C ratios for CHO formulas (which represent potential organosulfate precursors) in the North American-influenced samples (Fig. 6c) are considerably higher than those observed for the Aged Marine and Mixed Source samples (Fig. 6b and e). If these organosulfate compounds result from the acid-catalyzed reaction of H2SO4 with alcohol and carbonyl groups in these marine and North American-influenced CHO compounds of differing O / C ratios, then the organosulfate products logically have differing O / C ratios. It thus appears that organosulfates from marine and continental combustion sources are molecularly distinct.
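The elemental class and O / S screening discussed for the Aged Marine and North American-influenced groups can be sketched as follows. This is an illustrative example only: the function names, the labels, and the three formulas are hypothetical, the O / S threshold of 4 is taken from the text, and an O / S ratio can only indicate that a formula is consistent with an organosulfate, not confirm its structure.

```python
# Minimal sketch (hypothetical helper names): elemental class assignment and
# an O/S >= 4 screen for formulas consistent with organosulfates.
def elemental_class(formula):
    """Return a class label such as 'CHO', 'CHOS' or 'CHONS' from {element: count}."""
    return "".join(el for el in "CHONSP" if formula.get(el, 0) > 0)


def organosulfate_candidate(formula, min_o_to_s=4.0):
    """Label S containing formulas whose O/S ratio meets the threshold, else None."""
    sulfur = formula.get("S", 0)
    if sulfur == 0 or formula.get("O", 0) / sulfur < min_o_to_s:
        return None
    if formula.get("N", 0) > 0:
        return "nitrooxy organosulfate-like"
    return "organosulfate-like"


def o_to_c(formula):
    """O/C ratio used to compare candidates from different sources."""
    return formula.get("O", 0) / formula["C"]


# Hypothetical formulas, not taken from the PCA loadings of this study.
chos = {"C": 10, "H": 18, "O": 7, "S": 1}
chons = {"C": 10, "H": 17, "N": 1, "O": 10, "S": 1}
low_o = {"C": 12, "H": 24, "O": 3, "S": 1}
labels = [organosulfate_candidate(f) for f in (chos, chons, low_o)]
```

Comparing `o_to_c` across the retained candidates is then one simple way to contrast lower O / C and higher O / C organosulfate-like formulas between sample groups.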

North African-influenced samples
The North African-influenced group PCA loadings were dominated by contributions from CHO molecular formulas, which represented 361 of the 408 molecular formulas defining the sample group. The North African-influenced sample loadings were defined by having high O / C ratios (0.3 < O / C < 0.8) relative to the marine groups and H / C ratios between 0.8 and 1.70 (Fig. 6d). Average magnitude-weighted abundances confirm that CHO formulas dominated the aerosol WSOM, representing 76 % of the spectral intensity (Table 1). This sample group also had the least aliphatic character as evidenced by having the lowest magnitude-weighted H / C ratio (1.40) and highest magnitude-weighted average AImod value (0.18, Table 1). Despite the high AImod value for the North African-influenced samples, the WSOM compounds in these samples show little spectral contribution from formulas that could be considered black carbon, the highly condensed hydrocarbon byproduct of fossil fuel and biomass combustion. The molecular formulas characterizing the North African-influenced samples occupy the region of the van Krevelen diagram where lignin molecules would plot (e.g., Sleighter and Hatcher, 2007). A study of dust deposited to a buoy in the northeast Atlantic off the coast of northwest Africa found lignin phenols to be a significant component of the dust OM and further characterized the lignin as highly altered and derived from non-woody angiosperms based on acid / aldehyde, cinnamyl / vanillyl, and syringyl / vanillyl ratios (Eglinton et al., 2002). The CHO compounds accounting for the molecular formulas in Fig. 6d are thus also suggestive of a lignin source that has been highly altered via diagenesis in desert soils.
Desert dust contains OM that is highly processed within the soil and as a result has less aliphatic character. It also has less influence from anthropogenic N and S sources which explains the lesser contributions from heteroatom containing formulas for the North African-influenced samples vs. the North American-influenced PCA group. Organosulfate formation is known to occur under acidic conditions such as those created by the emissions of nitrates and sulfates during fossil fuel combustion and was shown by Schmitt-Koplin et al. (2010) to be more efficient for compounds with higher H / C ratios. The lack of compounds with high H / C ratios further explains the relatively low contributions from CHOS compounds in these samples.
Compared to European and North American-influenced samples, aerosol OM influenced by Saharan dust is severely understudied, and considerably less is known about its detailed molecular composition. The data presented here demonstrate that samples with clear Saharan influence show a predominance of CHO aerosol WSOM compounds with less aliphatic character than typically found for aerosol WSOM influenced by the North American continent (e.g., Fig. 6c and d; Wozniak et al., 2008), which experiences considerably more combustion influence known to both emit fossil fuel OM and stimulate secondary organic aerosol formation (e.g., de Gouw et al., 2008; Carlton et al., 2010). Fossil fuel combustion and secondary organic sources common to North American aerosol OM are comparatively minimal in North African source regions, and this is reflected in the differences in the North American-influenced and North African-influenced PCA group molecular characteristics. The defining molecular characteristics of the North African-influenced sample group are consistent with those of soil OM that has been highly degraded and is perhaps rich in aromatic compounds characteristic of oxidized soil humic material (Tan, 2003) and transported with Saharan dust as it is picked up with strong winds.
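The modified aromaticity index values compared across sample groups can be computed per formula from element counts. The sketch below assumes the widely used definition of Koch and Dittmar (2006), AImod = (1 + C - 0.5 O - S - 0.5 H) / (C - 0.5 O - S - N - P), with the index set to 0 when the numerator or denominator is not positive; this section does not restate the definition, so the formula and the function name should be treated as assumptions rather than as the exact calculation used for Table 1.

```python
# Minimal sketch, assuming the Koch and Dittmar (2006) definition of the
# modified aromaticity index; not necessarily the exact form used in the study.
def ai_mod(c, h, o=0, n=0, s=0, p=0):
    """Modified aromaticity index for a formula given as element counts."""
    numerator = 1 + c - 0.5 * o - s - 0.5 * h
    denominator = c - 0.5 * o - s - n - p
    if numerator <= 0 or denominator <= 0:
        return 0.0  # by convention, no aromaticity is assigned
    return numerator / denominator


benzene = ai_mod(c=6, h=6)        # fully aromatic reference compound
hexane = ai_mod(c=6, h=14)        # saturated aliphatic reference compound
vanillin = ai_mod(c=8, h=8, o=3)  # a lignin-derived phenol, for comparison
```

Saturated aliphatic formulas return 0 and aromatic formulas return larger values, so a magnitude-weighted mean of this index over all assigned formulas gives a bulk measure of the kind discussed for the North African-influenced samples.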

Mixed Source samples
The remaining five aerosol samples, which form the Mixed Source sample group, showed molecular characteristics reflecting a mixture of marine and continental sources. The magnitude-weighted O / C, H / C, AImod, and % contributions from CHO, CHON, CHOS, and CHONS formulas for the Mixed Source samples are all intermediate relative to the other sample groups (Table 1). The PCA loadings for the Mixed Source samples plotted in the van Krevelen diagram (Fig. 6e) are the most diverse in terms of elemental constituents, but CHO (51 %), CHOS (20 %), and CHOP (12 %) formulas are the most abundant. The CHO formulas defining the Mixed Source samples have a higher O / C range (0.15-0.5) than the Aged Marine sample group (Fig. 6b; O / C ∼ 0.15-0.4) but do not extend to values as high as those defining the North American-influenced sample group (Fig. 6c; O / C ∼ 0.5-0.9). The CHOS formulas defining the Mixed Source sample group show O / C ratios similar to both the Aged Marine and North American-influenced sample groups.
The diverse PCA loadings and intermediate magnitude-weighted average molecular characteristics suggest that these samples do not have a singular dominant source to their aerosol WSOM but that they are influenced by both marine and continental sources. The Mixed Source samples consisted of four samples with marine air mass trajectories and one sample with a North American air mass trajectory, and of course, all five samples were collected over the marine environment. The lack of a dominant WSOM source to these samples and the location of their collection over the middle of the ocean suggest a weak marine source to these samples and demonstrate the long range (in distance and time) transport of continentally derived WSOM.

Summary and implications
The analytical approach and transatlantic coverage in this study allowed the definition and confirmation of several source-specific aerosol WSOM characteristics that add to the body of literature obtained in marine aerosol OM studies using other techniques and/or sampling strategies. Although samples from similar air mass trajectories frequently showed similar molecular characteristics, air mass trajectories were not sufficient in describing the provenances of aerosol WSOM. PCA enabled the distinction of three types of marine aerosols defined by the degree to which they have been processed post-emission and their dilution with continental aerosol WSOM (Primary Marine, Aged Marine, and Mixed Source) and identified two samples with North American back trajectories as having primary marine sourced aerosol WSOM. These differences in air mass trajectory and PCA defined WSOM source illustrate the need to utilize multiple lines of evidence for determining aerosol OM provenance.
The PCA and FTICR MS results demonstrate that for samples collected over the ocean, aerosol WSOM having marine sources has lower O / C and higher H / C ratios than continentally influenced WSOM, properties that derive from the characteristics of sea surface DOM and the bubble bursting process that emits that DOM as marine aerosols. The bubble bursting process known to produce sea spray aerosols has been shown in laboratory experiments to emit the more aliphatic (higher H / C) and less oxygenated (lower O / C) components within sea surface DOM (Schmitt-Koplin et al., 2012). The marine-derived aerosol WSOM was further differentiated in this study to describe samples that have been processed to different degrees in the atmosphere. The Primary Marine sample group showed molecular characteristics indicative of a marine biological component as demonstrated by the peptide-like and phospholipid-like CHON and CHONP containing molecular formulas present in the WSOM. The Aged Marine sample group is indicative of aerosol WSOM with a marine source that has been processed in the atmosphere as evidenced by the aliphatic (relative to the North American-influenced samples) organosulfates that characterize these samples, and the Mixed Source sample group appears to show samples with WSOM from a mixture of aged, primary, and continental sources.
Among the continentally influenced samples, the PCA clearly demonstrates the North American-influenced samples to have higher H / C and O / C ratios and heteroatomic (N, S) content than the North African-influenced samples. The higher N and S content and O / C ratios are likely due to the North American-influenced samples being oxidized in the atmosphere in the presence of combustion-emitted nitrates and sulfates in this more industrialized region. The lower H / C ratios observed for the North African-influenced samples may be more of an indication of a lack of high H / C ratio compounds emitted from desert environments than a lack of low H / C ratio compounds emitted from North America and Europe. Combustion processes are well known to emit condensed aromatic compounds including black carbon, and tens to hundreds of condensed aromatic formulas were found in each North American-influenced sample (data not shown).
It is likely, however, that the aliphatic, highly oxygenated compounds in the North American-influenced samples dominated the signal in the electrospray source keeping these more condensed compounds at low abundance and therefore not considered in the PCA. Aerosols from the North African region, highly influenced by the Saharan desert as evidenced by their orange coloration, carry OM from desert soils that have been subjected to degradation processes resulting in the loss of the aliphatic components of biomass. As a result, the lower H / C ratios in these samples are observed.
Though direct study of the samples' physical properties was not conducted, the defining molecular characteristics described here suggest some important source-specific environmental implications. UV and visible light absorption result from electron transitions, mainly π → π * and n → π * transitions associated with C = C, C = O, C-O, and aromatic rings (Andreae and Gelencsér, 2006), and the higher average H / C ratios and lower average AImod values identified for the two marine aerosol WSOM sample groups suggest that they have a lesser radiative impact on a per carbon basis than aerosol WSOM from continental environments. This distinction between natural marine and continental (natural or anthropogenic) aerosol WSOM is an important factor to be considered in climate models attributing the direct effect on climate to various sources. Aerosol hygroscopicity has been related to cloud condensation nuclei formation, with increased hygroscopicity leading to a higher indirect effect on climate via cloud formation. The higher O / C content observed in the continental aerosol WSOM similarly suggests that continental aerosols have a higher impact, on a per carbon basis, on hygroscopicity, which has been linked to oxygen content, and thus on climate. The aerosol WSOM characteristics described here for this transatlantic transect of aerosol samples, therefore, point to important potential source-specific differences in climate relevant properties. Future work must link molecular characteristics and physical properties on aerosols collected at the same time to confirm these indications.
Finally, ESI FTICR MS, as demonstrated in this study, is a powerful technique providing extensive aerosol WSOM molecular characterization, but it does have a specific analytical window, and a great many compounds present in aerosol WSOM elude detection using this technique. For example, carbohydrates have been identified as a major constituent of primary marine aerosol OM but do not ionize well in the ESI source. As such, the current work is not comprehensive and should be viewed (as should all studies of OM characterization) in the context of its methodological approach. With that in mind, the WSOM FTICR MS data presented here make a significant contribution to the characterization of aerosol OM in remote marine aerosols influenced by continental sources to varying degrees. Future work combining online and offline OM characterization techniques will enable the comprehensive characterization needed to fully understand the link between OM sources