A global aerosol classification algorithm incorporating multiple satellite data sets of aerosol and trace gas abundances

Detecting the optical properties of aerosols using passive satellite-borne measurements alone is a difficult task due to the broadband effect of aerosols on the measured spectra and the influences of surface and cloud reflection. We present another approach to determine aerosol type, namely by studying the relationship of aerosol optical depth (AOD) with trace gas abundance, aerosol absorption, and mean aerosol size. Our new Global Aerosol Classification Algorithm, GACA, examines relationships between aerosol properties (AOD and extinction Ångström exponent from the Moderate Resolution Imaging Spectroradiometer (MODIS), UV Aerosol Index from the second Global Ozone Monitoring Experiment, GOME-2) and trace gas column densities (NO2, HCHO, SO2 from GOME-2, and CO from MOPITT, the Measurements of Pollution in the Troposphere instrument) on a monthly mean basis. First, aerosol types are separated based on size (Ångström exponent) and absorption (UV Aerosol Index), then the dominating sources are identified based on mean trace gas columns and their correlation with AOD. In this way, global maps of dominant aerosol type and main source type are constructed for each season and compared with maps of aerosol composition from the global MACC (Monitoring Atmospheric Composition and Climate) model. Although GACA cannot correctly characterize transported or mixed aerosols, GACA and MACC show good agreement regarding the global seasonal cycle, particularly for urban/industrial aerosols. The seasonal cycles of both aerosol type and source are also studied in more detail for selected 5×5 regions. Again, good agreement between GACA and MACC is found for all regions, but some systematic differences become apparent: the variability of aerosol composition (yearly and/or seasonal) is often not well captured by MACC, the amount of mineral dust outside of the dust belt appears to be overestimated, and the abundance of secondary organic aerosols is underestimated in comparison with GACA. 
Whereas the presented study is of exploratory nature, we show that the developed algorithm is well suited to evaluate climate and atmospheric composition models by including aerosol type and source obtained from measurements into the comparison, instead of focusing on a single parameter, e.g., AOD. The approach could be adapted to constrain the mix of aerosol types during the process of a combined data assimilation of aerosol and trace gas observations.


Introduction
Measurements of aerosol optical depth (AOD) by ground-based, airborne, and satellite-borne instruments have provided us with a good picture of the highly variable distribution of aerosols throughout the globe. The uncertainties in our knowledge of the global distribution of aerosol loading have become progressively smaller during the past decade owing to dedicated satellite-borne aerosol instruments like the Moderate Resolution Imaging Spectroradiometer and the Multi-angle Imaging Spectroradiometer (MODIS and MISR; see e.g., Remer et al., 2005; Kahn et al., 2005; Kokhanovsky and de Leeuw, 2009; Chin et al., 2014, and references therein). However, for many applications the aerosol amount tells only half of the story: to study the interaction between aerosols and clouds (Rosenfeld et al., 2014), to determine aerosol radiative effects, and for the development of mitigation strategies it is crucial to additionally know the aerosol type or source (e.g., IPCC, 2013). For remote sensing retrievals themselves, aerosol optical properties or some constraints on particle type are also needed to aid model selection in the inversion process.
The contribution of aerosols to the top-of-atmosphere radiance detected by satellite instruments is spectrally smooth, and due to the interfering signal from the surface, passive radiometers like MODIS cannot retrieve more than one or two pieces of information from their measurements: AOD and the extinction Ångström exponent (EAE). The EAE, as a proxy for the particle size distribution, turns out to be a very useful metric when characterizing aerosol types. Naturally emitted primary aerosols, such as mineral dust and sea salt, consist of relatively large particles with a size distribution centered at sizes > 1 µm. In contrast, secondary aerosols, i.e., those formed from components emitted in gaseous form, are generally (much) smaller than 1 µm (i.e., the extinction is almost entirely due to small particles; Seinfeld and Pandis, 2006). The majority of such "fine" particles is often assumed to be of anthropogenic origin (Kaufman et al., 2002), although biomass burning aerosols, which consist mostly of fine particles, are not all human-induced. In addition, there are strong biogenic sources of small secondary organic aerosols. To further discriminate between aerosol types, differences in absorption can be exploited (as e.g., in Higurashi and Nakajima, 2002; Jeong and Li, 2005; Kim et al., 2007; Mielonen et al., 2009). This allows for the distinction of desert dust (large particles that absorb in the UV range) from sea salt (large, but non-absorbing), and smoke (small, absorbing) from industrial pollution (small, weakly or non-absorbing), for example. In practice, such simple rules are often violated: aging of particles (hygroscopic growth, coating or other processes) or mixing of different aerosol types change the optical properties. To determine the (most probable) main aerosol source, more information is required. We use measurements of trace gas abundances as a source of this information.
Apart from naturally formed particles (desert dust and sea salt), aerosols are often accompanied by enhanced trace gas levels, because they were emitted by the same source, or were formed from those trace gases or from the same precursor. Hence, collocated measurements of trace gases can be used to determine the main source of aerosols. This has been exploited in a study by Veefkind et al. (2011), in which it was shown that the presence of significant correlation of AOD with trace gas concentrations, notably NO2 and HCHO, is an indication of the main source of those aerosols. In a later publication, also involving data from the Ozone Monitoring Instrument (OMI), Torres et al. (2013) demonstrated that the use of CO data from the Atmospheric Infrared Sounder (AIRS) to identify smoke improves the aerosol retrieval by OMI. In the present study, we take these findings a step further and integrate them into an algorithm to determine the main aerosol type and its source on a global scale. We extend the analysis initiated by Veefkind et al. (2011) by adding CO abundance and aerosol optical properties. The resulting Global Aerosol Classification Algorithm, GACA, combines the EAE from MODIS and UV Aerosol Index (UVAI) from GOME-2 (Global Ozone Monitoring Experiment-2) to determine an aerosol type based on its size and absorption. Subsequently, trace gas vertical column densities (VCDs of NO2, HCHO, SO2, and CO) are used to infer the dominating source of the aerosols. The main results from this algorithm are seasonal maps that show the dominating aerosol type and source at 1° × 1° or 2° × 2° resolution, respectively. GACA results are compared to aerosol composition from MACC (Monitoring Atmospheric Composition and Climate) reanalysis data on a global and regional scale. The MACC project provides data on atmospheric composition for the recent past and makes midterm forecasts by combining state-of-the-art atmospheric modeling with satellite-based measurements (e.g., Inness et al., 2013). The model assimilates AOD from both MODIS instruments, using it to scale the total aerosol mixing ratio. The tropospheric aerosol types (or components) included in MACC are sea salt, desert dust, organic matter, black carbon, and sulfate. The comparison with model data highlights an important application of our algorithm: the improvement of emissions of both trace gases and aerosols in models (as suggested in e.g., Xu et al., 2013).
In this paper we present GACA and demonstrate its capabilities with seasonal global maps of aerosol type and main source, seasonal cycles of aerosol type and source in six selected regions, and several other applications. We find good agreement between results from GACA and MACC reanalysis in most cases; some important discrepancies between the data sets are discussed. The paper is structured as follows: first, we describe the instruments and data sets used in GACA. The algorithm is described in detail in Sect. 3. Global maps of aerosol type and aerosol source determined by GACA are presented and compared with maps of aerosol composition from the MACC reanalysis in Sect. 4, where the study of the seasonal cycle in six study regions is also shown. In Sect. 5 the sensitivity of GACA to various parameters is discussed, GACA results are compared to existing aerosol climatologies, and future improvements to the algorithm are suggested; the closing Sect. 6 contains our concluding remarks.

Satellite instruments
There are two MODIS instruments in operation: one each on NASA's Aqua and Terra satellites. Designed to detect aerosols, the MODIS instruments measure reflectances in 36 wavelength bands at high spatial resolution (of the order of 1 km² or less) with a swath wide enough (2600 km) to provide daily global coverage (Justice and Townshend, 2002, and references therein). The MOPITT instrument (Pan et al., 1998) is also part of the payload of the Terra satellite. MOPITT pixels measure 22 km × 22 km and the swath of the instrument is 640 km; hence, global coverage is reached approximately every 3 days.
GOME-2 on MetOp-A is a spectrometer that measures backscattered radiance in the UV-NIR range (240-790 nm) with a nominal spatial resolution of 40 km × 80 km (Callies et al., 2000). The swath width of the GOME-2 instrument is 1920 km, permitting global coverage in 1.5 days. MetOp-A was launched in 2006 into a daytime-descending orbit with a local Equator crossing time of 09:30 LT.

Data sets
The data sets that are used as input to GACA are briefly introduced in this section; for details we refer to the literature and websites listed in Table 1.

Aerosol optical depth and extinction Ångström exponent
Monthly mean values of AOD (or τ) from MODIS collection 5.1 were obtained at 1° × 1° resolution. The retrieval algorithms for aerosols over ocean and dark land are described in Remer et al. (2005) and Levy et al. (2007b), respectively. For bright surfaces where no AOD value is available from the dark target algorithm (mainly deserts), the Deep Blue product (Hsu et al., 2004) is used. In this study, data from the MODIS instrument on Aqua are used, despite the better agreement of the overpass times of GOME-2 and Terra, because the Deep Blue data set of Terra reaches only up to 2007 due to a missing polarization correction. The Level-3 reprocessing of collection 6, in which the calibration of both MODIS instruments is improved and several other algorithm updates have been made (Levy et al., 2013; Lyapustin et al., 2014), is incomplete at the time of writing. MODIS AOD is given at 550 nm. Monthly mean EAE (α) is calculated according to Eq. (1) from the mean MODIS AOD:
$$\alpha = -\frac{\ln\left(\tau_{\lambda_1}/\tau_{\lambda_2}\right)}{\ln\left(\lambda_1/\lambda_2\right)}, \qquad (1)$$
with τ_λ as the monthly mean AOD at the wavelengths λ1 = 470 nm and λ2 = 660 nm. Those are the only two channels for which AOD is determined for land, ocean, and bright surfaces. The EAE was chosen over the fine-mode fraction (FMF) because FMF is not part of the Deep Blue aerosol product, thus no aerosol size information would be available over deserts and other bright surfaces. A more detailed discussion of EAE and FMF appears in Sect. 5.2.
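As an illustration, the two-wavelength EAE can be computed with the standard definition α = −ln(τ1/τ2) / ln(λ1/λ2). The following is a minimal sketch, not the code used in this study; the function name `angstrom_exponent` and the treatment of non-positive AOD as missing are assumptions made for the example.

```python
import numpy as np

def angstrom_exponent(tau_1, tau_2, lambda_1=470.0, lambda_2=660.0):
    """Extinction Angstrom exponent from AOD at two wavelengths (in nm).

    Hypothetical helper: scalars or arrays are accepted, and NaN is
    returned wherever either AOD value is missing or not positive.
    """
    tau_1 = np.asarray(tau_1, dtype=float)
    tau_2 = np.asarray(tau_2, dtype=float)
    # Assumption: non-positive AOD cannot be log-transformed and is
    # therefore treated like a missing value.
    valid = (tau_1 > 0) & (tau_2 > 0)
    ratio = np.where(valid, tau_1, np.nan) / np.where(valid, tau_2, np.nan)
    return -np.log(ratio) / np.log(lambda_1 / lambda_2)
```

Because the AOD of coarse particles depends only weakly on wavelength, values near zero point to large particles, whereas values well above 1 point to fine particles.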

UV Aerosol Index
The UVAI is a semi-quantitative indicator of aerosols. Positive values of UVAI are generally referred to as "Absorbing Aerosol Index (AAI)", which is a measure of aerosols that absorb UV radiation (Torres et al., 1998; de Graaf et al., 2005). For UVAI < 0, which can be used for the detection of non-absorbing aerosols (Penning de Vries et al., 2009), the term "SCattering Index (SCI)" was suggested. The UVAI is a complex function of AOD, aerosol absorption, and layer altitude, and using it in a quantitative sense is not straightforward. However, in combination with auxiliary information on aerosol abundance (i.e., AOD), information on aerosol absorption can be derived from UVAI. Although "AAI" is more often used in literature, we prefer to use the term "UVAI", as we use both the positive and negative values of the Aerosol Index. Level-2 operational UVAI (determined using 340 and 380 nm GOME-2 reflectances) from the O3M SAF (Satellite Application Facility for Atmospheric Composition and UV Radiation; o3msaf.fmi.fi) were obtained from the Tropospheric Emission Monitoring Internet Service (TEMIS); a description of the algorithm can be found in de Graaf et al. (2005, 2014). The UVAI were corrected for the effects of instrument degradation using empirically derived in-flight reflection correction factors (Tilstra et al., 2012). The data were filtered for sunglint, single scattering angles smaller than 90°, and solar eclipses, as recommended in the "ATBD for the GOME-2 Aerosol products" by de Graaf et al. (2014). In addition, data with FRESCO (Fast Retrieval Scheme for Clouds from the Oxygen A band) effective cloud fractions (Wang et al., 2008) exceeding 0.2 or solar zenith angle (SZA) over 80° were discarded prior to gridding and averaging to comply with the data selection of the trace gases measured by GOME-2 (see next section).

Trace gases
Total column densities of SO2 and HCHO, and tropospheric column densities of NO2 are retrieved by DOAS analysis (Differential Optical Absorption Spectroscopy; see e.g., Platt and Stutz, 2008; Richter and Wagner, 2011) of GOME-2 spectra in the UV-visible range.
For our study, TM4NO2A version 2.1 Level-2 NO2 tropospheric VCDs were obtained from TEMIS. The retrieval of NO2 from GOME, similarly applied to GOME-2, is described in Boersma et al. (2004).
Our retrieval of GOME-2 SO2 data is described in detail in Hörmann et al. (2013). It takes into account non-linear effects that may occur for high SO2 concentrations.
All GOME-2 trace gas data were filtered by FRESCO cloud fraction (CF < 0.2, unless stated otherwise) and solar zenith angle (SZA ≤ 80°), matching the selection applied to the UVAI data.
Monthly mean, gridded version-6 MOPITT CO total VCDs were obtained from the Atmospheric Science Data Center (ASDC). We used results from the combined near- and thermal-infrared (NIR-TIR) retrieval because combination of the two spectral regions greatly improves the sensitivity to the lower troposphere (Deeter et al., 2003, 2013). A recent validation of the NIR-TIR algorithm found relatively large random retrieval errors and bias drift (Deeter et al., 2013), but these are not expected to significantly influence our results for two reasons: first, we use monthly mean data on a coarse 1° × 1° grid, which reduces random errors; and, second, we use the excess CO (value minus background) instead of the absolute value, which should remove a time-dependent bias. The total excess CO column used here (denoted as CO) is obtained by subtracting a background column that is the median of the data within each 5° latitude band. This procedure is needed due to the long lifetime of CO and allows using a single CO threshold value throughout the year and for the whole globe.
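The background subtraction for CO can be sketched as follows. This is an illustrative example under stated assumptions, not the processing code of this study: the function name `excess_co`, the array layout (latitude rows, longitude columns), and the alignment of the 5° bands with the South Pole are all hypothetical choices.

```python
import numpy as np

def excess_co(co, lat_centers, band_width=5.0):
    """Subtract a latitude-band median background from a gridded CO field.

    co          : 2-D array (n_lat, n_lon) of monthly mean CO columns,
                  with NaN marking missing values
    lat_centers : 1-D array (n_lat,) of grid-box center latitudes (degrees)
    """
    co = np.asarray(co, dtype=float)
    lat_centers = np.asarray(lat_centers, dtype=float)
    # Assumption: bands start at 90 S, so [-90, -85) is band 0, and so on.
    band = np.floor((lat_centers + 90.0) / band_width).astype(int)
    excess = np.full_like(co, np.nan)
    for b in np.unique(band):
        rows = band == b
        values = co[rows]
        if np.all(np.isnan(values)):
            continue  # no data in this band; the output stays NaN
        # The background is the median of all valid data in the band.
        excess[rows] = values - np.nanmedian(values)
    return excess
```

Using the median rather than the mean keeps the background estimate insensitive to a few strongly polluted grid boxes within a band.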

MACC model data
The MACC reanalysis was developed and produced during the series of EU-funded GEMS (Global and regional Earth-system (Atmosphere) Monitoring using Satellite and in situ data), MACC and MACC-II (MACC-Interim Implementation) projects. These projects developed the operational Copernicus Atmosphere Monitoring Service (CAMS), which was launched in November 2014. It delivers global atmospheric composition analyses and forecasts and European air quality forecasts every day. While the main developments were aimed at real-time production, periodic reanalyses have been planned from the outset to provide consistent time series for various scientific applications (Hollingsworth et al., 2008, www.copernicus-atmosphere.eu). The aerosol model is integrated into the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) for numerical weather predictions and uses the total aerosol mixing ratio as a control variable. Five types of tropospheric aerosols are included: sea salt, desert dust, organic matter, black carbon, and sulfate. Aerosols of natural origin (sea salt and desert dust) are related to model parameters (wind speed and soil moisture), whereas anthropogenic aerosol emissions come from inventories (Morcrette et al., 2009). In particular, biomass burning emissions are distributed with 0.5° and 1-day resolution according to GFASv1.0 (Global Fire Assimilation System; Kaiser et al., 2012), with monthly budgets before 2009 scaled to GFED3.0 (Global Fire Emissions Database; van der Werf et al., 2010). The aerosol assimilation system uses AOD from both MODIS sensors at the time and location of overpass to scale the total aerosol abundance, while retaining the fractional contribution of each aerosol component to the total mass (Benedetti et al., 2009).

Global Aerosol Classification Algorithm description
GACA is based on the outcome of several tests applied to the trace gas and aerosol data described in the previous section, and their correlation with AOD. The algorithm consists of two main parts: the first part, named GACA-type, assigns certain aerosol types to each data point within a grid box based on UVAI (a measure of aerosol absorption) and EAE (a measure of aerosol size). The second part, GACA-source, relates trace gas abundance to the different aerosol types and assigns the most probable aerosol source to each grid box. Both parts will be described in detail in Sects. 3.2 and 3.3, and are summarized in the decision tree in Fig. 2.

Data selection
Figure 1 (caption; beginning missing): ... and size (EAE). Left, aerosol types color-coded according to size (larger sizes have darker hues) and absorption (non-absorbing in blue, neutral in green, absorbing in red): LA, large absorbing; MA, medium-size absorbing; SA, small absorbing; LN, large, neutral; MN, medium-size, neutral; SN, small, neutral; LNA, large, non-absorbing; MNA, medium-size, non-absorbing; SNA, small, non-absorbing. Right, monthly mean UVAI and EAE within grid boxes in regions dominated by desert dust (red dots), biomass burning smoke (gray crosses), secondary biogenic aerosols (green circles), and sea salt (light blue pluses). Data are from June-August 2007-2011; see the text for the selected geographical regions.
Prior to analysis, GACA performs a selection of data for each "grid box". For a final map with a resolution of 2° × 2° (which was chosen as a compromise between spatial resolution and statistics), each grid box on the globe contains four data points per month, because the input monthly mean maps (of AOD, UVAI, EAE, and trace gas column densities) have a resolution of 1° × 1°. To improve statistics and stability of the algorithm, the data are grouped by season and 5 years of data (2007-2011) are combined, increasing the number of data points to 60 (four points for each of three months in each of five years). Grid boxes in which the monthly mean AOD never exceeds 0.05 are removed, as it is assumed that they cannot be reliably classified. The obtained data set is screened for missing values and outliers; the latter because the intention is to build a climatology of typical conditions, which should not be influenced by exceptional events. In addition, faulty retrievals (e.g., due to the South Atlantic Anomaly) are removed. Outliers are removed by repeated exclusion of data points exceeding the mean-plus-3σ criterion until all data fall within the 3σ range. Whenever an AOD, EAE or UVAI outlier is encountered, all corresponding values (collocated AOD, UVAI, EAE, and trace gas columns) are removed from the data set. Trace gas outliers are also excluded, but in this case only the affected data point is removed. Hence, if an NO2 outlier is encountered, the NO2 value is removed, but HCHO, SO2, and CO columns and aerosol data are retained (i.e., in this case the mean NO2 VCD is calculated with one data point less than the means of the other trace gases and aerosol data; the same applies to the calculation of the correlation with AOD). If outliers are not removed from the data set, GACA results are not strongly affected, but the effects of local extreme events (fires, volcanic eruptions) become apparent. This is discussed in more detail in Sect. 5.1.
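The repeated 3σ exclusion described above can be sketched as an iterative clipping loop. This is a minimal illustration, not the implementation used in this study. The function name `sigma_clip_mask` is hypothetical, and the two-sided test around the mean is an assumption, since the text only states that all data should eventually fall within the 3σ range.

```python
import numpy as np

def sigma_clip_mask(values, n_sigma=3.0):
    """Return a boolean mask of the points kept after iterative clipping.

    Mean and standard deviation are recomputed from the points that are
    still kept, and the loop stops once no further point is excluded.
    """
    values = np.asarray(values, dtype=float)
    keep = ~np.isnan(values)  # missing values are never kept
    while keep.sum() > 1:
        mean = values[keep].mean()
        std = values[keep].std()
        # Assumption: deviations on both sides of the mean are tested.
        outlier = keep & (np.abs(values - mean) > n_sigma * std)
        if not outlier.any():
            break
        keep &= ~outlier
    return keep
```

For the aerosol quantities (AOD, EAE, UVAI), the resulting masks would be combined and applied to all collocated variables, whereas the mask of a trace gas would be applied to that trace gas only.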

Aerosol type classification by GACA-type
Each point of the filtered data set is subsequently assigned one of nine aerosol types based on its UVAI and EAE values. In this study, aerosol types are defined by their size, i.e., small (S), medium (M), or large (L), and by the amount of aerosol absorption in the UV range, i.e., non-absorbing (NA), neutral (N), or absorbing (A), as shown in the left panel of Fig. 1. The acronyms of aerosol types and sources are explained in Table 2.
The choice of UVAI and EAE thresholds is motivated by the right panel of Fig. 1, which displays monthly mean data (June-August 2007-2011) from regions which we assume to be dominated by one of four aerosol sources: mineral dust (14-26° N, 16° W-8° E), smoke (4-16° S, 14-30° E), biogenic secondary organic aerosols (30-36° N, 80-90° W), and sea salt (0-10° S, 120-140° W). The depicted aerosols are clearly separated by the EAE thresholds (sea salt from secondary organic aerosols; desert dust from smoke) and the UVAI thresholds (desert dust from sea salt; smoke from secondary organic aerosols). The choice of nine aerosol types instead of four (like in Higurashi and Nakajima, 2002) was motivated by the occurrence of situations where different particle types are mixed.
For each 2° × 2° grid box, the fraction of data points belonging to each aerosol type is computed and the most frequently observed type, weighted by AOD, is assumed to be the dominant type. Note that if the type classification is run on its own (i.e., not as input for the aerosol source assignment step), the statistics requirements are less strict and global maps can be produced on 1° × 1° resolution (e.g., Fig. 4).
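The two steps of GACA-type (assigning one of nine types per data point and selecting the dominant type per grid box) can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the threshold values must be supplied by the caller because they are defined in Fig. 1 rather than in the text, the handling of values exactly on a threshold is an assumption, and "weighted by AOD" is interpreted here as summing the AOD of all points of a given type.

```python
from collections import defaultdict

def classify_type(uvai, eae, uvai_low, uvai_high, eae_low, eae_high):
    """Assign one of nine type labels (e.g. 'LA', 'MN', 'SNA')."""
    # A small EAE indicates large particles, a large EAE small particles.
    if eae < eae_low:
        size = "L"
    elif eae < eae_high:
        size = "M"
    else:
        size = "S"
    # Low (negative) UVAI indicates non-absorbing aerosol, high UVAI
    # absorbing aerosol; values in between are labeled neutral.
    if uvai < uvai_low:
        absorption = "NA"
    elif uvai < uvai_high:
        absorption = "N"
    else:
        absorption = "A"
    return size + absorption

def dominant_type(types, aods):
    """Return the type with the largest AOD-weighted frequency."""
    weight = defaultdict(float)
    for label, aod in zip(types, aods):
        weight[label] += aod  # assumption: each point counts with its AOD
    return max(weight, key=weight.get)
```

With this weighting, a few data points with high AOD can outweigh a larger number of points with low AOD, so that the dominant type reflects the aerosol that contributes most to the total load.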

Aerosol source assignment by GACA-source
The results from GACA-type are used as input for the second part of GACA: the determination of the dominant aerosol source. The main assumption underlying GACA-source is that enhancements in trace gas and aerosol abundance are caused by the same source. The algorithm computes means over all data points within a grid box (of AOD, UVAI, and trace gas VCDs) and correlations between AOD on the one hand, and UVAI and trace gas VCDs on the other. Together with the dominant aerosol type determined in the previous step, these data are used to assign a main aerosol source based on the outcome of two types of tests: (1) Is the mean trace gas abundance or HCHO : NO2 ratio above the threshold given in Table 3? (2) Is there a linear correlation (with R² > 0.25) between AOD and UVAI or AOD and trace gas abundance? An overview of GACA-source can be found in the lower part of the decision tree in Fig. 2. Eight aerosol sources are discriminated in GACA-source: biomass burning smoke, desert dust, secondary biogenic, secondary urban/industrial, aged, volcanic sulfate, sea salt, and unknown sources. Each source and the selected classification criteria will be described in more detail in the following sections.
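The two types of tests can be sketched with a few lines of NumPy. This is a minimal illustration rather than the code used in this study; the function names are hypothetical, and R² is taken here as the squared Pearson correlation coefficient of the valid data pairs.

```python
import numpy as np

def r_squared(aod, x):
    """Squared Pearson correlation of AOD and x, ignoring missing pairs."""
    aod = np.asarray(aod, dtype=float)
    x = np.asarray(x, dtype=float)
    ok = ~np.isnan(aod) & ~np.isnan(x)
    # Too few pairs or a constant series: the correlation is undefined.
    if ok.sum() < 3 or np.std(aod[ok]) == 0 or np.std(x[ok]) == 0:
        return np.nan
    r = np.corrcoef(aod[ok], x[ok])[0, 1]
    return r ** 2

def mean_exceeds(x, threshold):
    """Test type (1): is the mean of the valid data above the threshold?"""
    return bool(np.nanmean(np.asarray(x, dtype=float)) > threshold)

def is_correlated(aod, x, r2_threshold=0.25):
    """Test type (2): is there a linear correlation with R^2 > 0.25?"""
    # An undefined (NaN) correlation compares as False.
    return bool(r_squared(aod, x) > r2_threshold)
```

The trace gas thresholds passed to `mean_exceeds` would be taken from Table 3; only the R² threshold of 0.25 is stated in the text.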

Biomass burning smoke (BB)
Figure 2 (caption fragment; the decision tree and the beginning of the caption are missing): ... Table 3. The mean value of a quantity, e.g., CO, is denoted CO; the coefficient of correlation between AOD and a quantity, e.g., HCHO, is denoted R²(HCHO). Thresholds are denoted as e.g., SO2,thresh, R²thresh, ratio_thresh (for the HCHO : NO2 ratio threshold), or AOD_SS-thresh (for the maximum AOD allowed for SS classification). Other abbreviations are explained in Table 2.
Fresh smoke from forest, agricultural, or grassland fires mainly consists of small particles (e.g., Dubovik et al., 2002; Eck et al., 2013) that absorb light in the UV and visible range.
Co-emitted trace gases are NO2, HCHO and CO, as well as SO2, though only in very small amounts (Andreae and Merlet, 2001). In GACA-source, grid boxes are always designated BB when the main type is small absorbing. Biomass burning is also assigned if the absorbing aerosol criterion is fulfilled and either (1) mean CO or (2) correlation between CO and AOD or (3) mean HCHO and correlation between HCHO and AOD pass the threshold. The absorbing aerosol criterion requires that either (a) the dominant aerosol type is absorbing or (b) the dominant type is neutral and a good correlation with a positive slope is found for UVAI and AOD, and mean AOD ≥ 0.15. This allows grid boxes with relatively small UVAI (e.g., due to lower-lying aerosol layers or cloud contamination) to be designated as BB.
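The biomass burning rule described above can be written as two small predicates. The sketch below is an illustration under stated assumptions, not the implementation of GACA-source: the function names are hypothetical, the type labels follow the acronyms of Fig. 1 (e.g., "SA", "MN", "LNA"), the AOD condition is assumed to apply to case (b) only, and the outcomes of the trace gas tests are passed in as booleans.

```python
def absorbing_criterion(dominant, r2_uvai, slope_uvai, mean_aod,
                        r2_thresh=0.25, aod_min=0.15):
    """Absorbing aerosol criterion for a grid box.

    dominant   : dominant type label of the grid box
    r2_uvai    : R^2 of the correlation between UVAI and AOD
    slope_uvai : slope of the linear fit of UVAI against AOD
    """
    # Non-absorbing labels end in 'NA' and must be tested before 'A'.
    if dominant.endswith("NA"):
        return False
    if dominant.endswith("A"):  # case (a): LA, MA, or SA
        return True
    # Case (b): neutral types (LN, MN, SN).
    return r2_uvai > r2_thresh and slope_uvai > 0 and mean_aod >= aod_min

def is_biomass_burning(dominant, absorbing, co_mean_high, co_correlated,
                       hcho_mean_high, hcho_correlated):
    """Return True if the grid box is designated BB."""
    if dominant == "SA":
        return True  # small absorbing aerosols are always designated BB
    gas_evidence = (co_mean_high or co_correlated
                    or (hcho_mean_high and hcho_correlated))
    return absorbing and gas_evidence
```

Note that HCHO must pass both its mean and its correlation threshold, whereas for CO either test is sufficient.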

Desert dust (DD)
Mineral dust consists of large, non-spherical particles that absorb UV radiation due mainly to their iron oxide content (Sokolik and Toon, 1999). The emission and transport of DD is linked to meteorology (i.e., wind fields) and land surface conditions and not to trace gas emissions. GACA-type assigns DD as a source to grid boxes that are dominated by large absorbing aerosols, unless they were already characterized as BB. To include aged DD plumes, medium-size and large neutral aerosol types can be attributed to DD if the absorbing aerosol criterion is fulfilled (see above) but, additionally, the correlation of CO and AOD and means of the other trace gases (NO2, HCHO, and SO2) should be below their respective threshold values. The latter criterion serves to distinguish DD from BB and volcanic ash but as a negative side effect excludes polluted dust and cases of mixed desert dust and smoke.

Secondary aerosols of biogenic origin (BIO)
The small, non-absorbing aerosols that form by condensation of (semi-)volatile biogenic precursors are accompanied by enhanced levels of HCHO, as both are products of the oxidation of isoprene and other volatile organic compounds (Seinfeld and Pandis, 2006; Goldstein et al., 2009; Stavrakou et al., 2009). To separate them from urban/industrial aerosols, the ratio of HCHO/NO2 is required to be above a certain threshold value (given in Table 3).

Secondary aerosols of urban/industrial origin (URB)
Due to the diversity of sources and chemical processing in industrialized environments, the URB source is very broadly defined in GACA-source. All grid boxes dominated by non-absorbing or neutral aerosol types that have enhanced NO2 columns qualify, the only exception being grid boxes already characterized as BIO.

Aged/transported aerosols (AGED)
Air masses with enhanced CO but low levels of NO2 are assumed to have been transported away from their sources. The AGED source is therefore assigned when CO, which has a long lifetime, is enhanced but the shorter-lived NO2 is not. Aging may change average aerosol properties by dilution, mixing with other air masses, processing within clouds, or other mechanisms. Hence, all neutral and non-absorbing aerosol types qualify as AGED.

Volcanic sulfate (VOG)
Secondary aerosols formed by the reaction of volcanic SO2 with the atmosphere are named volcanic smog (VOG) here to distinguish them from anthropogenic sulfate. GACA-source can only detect VOG in remote locations, as one requirement for the assignment is the lack of enhancements in NO2 and CO. In addition, the SO2 mean and correlation with AOD need to pass the thresholds. Freshly formed sulfate aerosols are small, but can grow rapidly due to their hygroscopicity; therefore small and medium-sized aerosol types can be assigned to VOG. Both non-absorbing and neutral aerosol types qualify because the sensitivity of UVAI to non-absorbing aerosols is not very high.
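The VOG conditions listed in this section can be combined into a single predicate. This is a sketch only: the function name and the boolean inputs are hypothetical, and the default set of eligible types follows the wording of this section (small and medium-size, non-absorbing and neutral); it is exposed as a parameter because the decision tree in Fig. 2 is the authoritative definition.

```python
VOG_TYPES = frozenset({"SNA", "SN", "MNA", "MN"})

def is_volcanic_sulfate(dominant, so2_mean_high, so2_correlated,
                        no2_enhanced, co_enhanced,
                        eligible_types=VOG_TYPES):
    """Return True if a grid box qualifies as volcanic sulfate (VOG).

    The so2_* flags state whether the SO2 mean and the SO2-AOD
    correlation pass their thresholds; no2_enhanced and co_enhanced
    state whether these trace gases are enhanced.
    """
    remote = not (no2_enhanced or co_enhanced)
    return (dominant in eligible_types and remote
            and so2_mean_high and so2_correlated)
```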

Sea salt (SS)
Breaking waves and bursting bubbles cause the release of sea salt particles.The particles are hygroscopic and grow readily in the marine boundary layer, forming large, non-absorbing particles.The emission of SS depends mainly on wind speed and geography (e.g., coastlines) but is not associated with the emission of trace gases.GACA-source attributes SS as a main source to grid boxes with mean AOD < 0.15 and no trace gas enhancements; only non-absorbing and neutral, large and medium-size type aerosols are eligible candidates.GACA does not discriminate between grid boxes located over land and ocean; therefore, the SS type is also regularly found over land and may be interpreted as a generic background type.
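The sea salt rule amounts to a simple conjunction of three conditions, sketched below. The function name and argument names are hypothetical; the AOD limit of 0.15 is the value stated in the text, and whether any trace gas is enhanced is assumed to have been determined beforehand.

```python
SS_TYPES = frozenset({"LNA", "LN", "MNA", "MN"})

def is_sea_salt(dominant, mean_aod, any_trace_gas_enhanced,
                aod_ss_thresh=0.15):
    """Return True if a grid box qualifies as sea salt (SS).

    Only large and medium-size, non-absorbing and neutral types are
    eligible; the mean AOD must be below the SS threshold and no trace
    gas may be enhanced.
    """
    return (dominant in SS_TYPES
            and mean_aod < aod_ss_thresh
            and not any_trace_gas_enhanced)
```

Because no land-ocean mask enters the test, the same rule also selects clean continental grid boxes, which is why SS can be read as a generic background type.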

Unknown source (XX)
If all tests leading to the above-mentioned aerosol sources fail but significant amounts of aerosols are detected (mean AOD > 0.05), the aerosol source is set to "unknown".

Source assignment
Means and correlation coefficients are calculated from all valid data points within a grid box if the fraction of valid points amounts to at least 25 % of all points (down to an absolute minimum of five). The tests performed by GACA-source are based on thresholds (given in Table 3), the values of which were chosen empirically. The source assignment criteria were chosen based on textbook knowledge (e.g., that biomass burning is associated with HCHO and CO emissions), as detailed for each source type in Sects. 3.3.1-3.3.8, and were adjusted iteratively to obtain consistent results. The quantitative understanding of aerosol-trace gas relationships is currently not sufficient to derive trace gas thresholds in a systematic way; the thresholds were therefore set high enough to exclude noise (or natural variability), but low enough that the associated sources are recognized. The CO threshold, for example, was chosen low enough to include aged air masses. The SO2 threshold, on the other hand, had to be set sufficiently high to exclude noise. The thresholds were chosen independent of region and season to keep the algorithm globally consistent. A future development of GACA may be the adoption of threshold climatologies to better account for regional and seasonal variability of trace gas and aerosol emissions (see Sect. 5.4).
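The requirement on the number of valid data points can be expressed as a short check. This sketch reflects one reading of the rule, namely that the required number is 25 % of all points in the grid box but never fewer than five; the function name and this interpretation are assumptions.

```python
import numpy as np

def enough_valid_points(values, min_fraction=0.25, min_count=5):
    """Check whether a grid box has enough valid (non-NaN) data points."""
    values = np.asarray(values, dtype=float)
    n_valid = int(np.count_nonzero(~np.isnan(values)))
    # Assumption: the 25 % rule applies, with five points as lower limit.
    required = max(min_count, int(np.ceil(min_fraction * values.size)))
    return n_valid >= required
```

For a 2° × 2° grid box with 60 data points per season, this reading requires at least 15 valid points; the lower limit of five only becomes relevant for grid boxes with fewer than 20 data points.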
In Fig. 3, data from June-August 2007-2011 are plotted together with their respective thresholds (colored lines) so that, if data points lie above the respective threshold, the trace gas is assumed to be associated with the local aerosols. In the left panel (central Africa), HCHO (green) and CO (light blue) are strongly enhanced. The level of NO2 (blue) clearly exceeds 10⁻¹⁵, the threshold for both NO2 and SO2. This is in contrast to SO2, which is close to or even below the detection limit, leading to scatter of data and negative values. The dominating source is BB, because (1) the dominating aerosol type is medium-size absorbing and (2) the correlation between CO and AOD is high (R² = 0.71). Over the remote eastern Pacific Ocean (right panel) the trace gas means and correlations usually fall below the threshold values; however, due to prodigious degassing of the Kilauea Volcano (especially in 2008) strongly enhanced SO2 columns can be observed in the selected grid box. In the atmosphere SO2 is converted to sulfate aerosols, resulting in a good correlation between AOD and SO2 of R² = 0.53. The dominating aerosol types are large neutral and large non-absorbing; the main source assigned to this grid box is volcanic sulfate (VOG).

Aerosol type
We applied GACA-type to the 2007-2011 data set to study the seasonal cycle of aerosol properties globally. Figure 4 shows maps of the dominating aerosol type on a 1° × 1° resolution for all four seasons. Focusing first on the summer (third panel), it can be seen that the dust belt, at around 10-40° N, is dominated by large particles (dark hues) with strong to moderate absorption (red and green tones). Smoke plumes from central Africa consist mostly of small to medium-size absorbing particles (orange and red), although there appears to be a significant contribution from large absorbing (LA) particles, which is probably an artifact that will be discussed in more detail in the next section. North America, Europe and large parts of Asia are dominated by small, non-absorbing aerosols (light blue). Over ocean, particularly in the southern oceans, large particles (dark blue and green) dominate. Light gray areas denote regions where no AOD data were available (due to e.g., clouds, snow or ice cover, low sun) or where monthly mean AOD did not exceed 0.05 within the studied period.
www.atmos-chem-phys.net/15/10597/2015/
In winter and spring (December-February; March-May) the contribution of mineral dust to the aerosol mix over China can be clearly seen: the aerosol type is dominated by larger, more strongly absorbing particles than in summer. The burning of cropland and agricultural waste in Southeast Asia stands out in spring, when aerosol types are predominantly absorbing (red and orange). The biomass burning season in South America, which starts in July-August and peaks in September-October, has a very different signature from that in southern Africa: the particles are smaller and appear less absorbing. This may be a consequence of the difference in fuel type (e.g., Eck et al., 2013), which leads to different trace gas and aerosol emission factors. The main causes, however, are probably the increased cloudiness, which leads to lower UVAI values and more data gaps in the trace gas products, and the large abundance of (non-absorbing) secondary organic aerosols. Despite the fact that wildfires occur frequently in summer in North America, BB is not selected as a major source there. This is because forest fires occur at irregular intervals, so that their signal is suppressed as a consequence of averaging data in time and space.
The frequency of occurrence of each aerosol type can be used to study changes in aerosol composition as a function of time (or distance to the source). As an example, the westward transport of Saharan dust over the Atlantic Ocean is shown in Fig. 5. The upper panel displays the mean total AOD along a longitudinal transect from 10° E to 80° W, at 15-20° N (see yellow box in panel 3 of Fig. 4). The lower panel presents the aerosol fraction, weighted by AOD, for the same transect. Only the three large aerosol types (LNA, LN, and LA) are shown; the other types never contribute more than 20 % to the total AOD. Close to the source, situated at roughly 10° E-10° W, the aerosol load is almost completely made up of large absorbing particles (LA, brown triangles). West of about 25° W, the fraction of large neutral aerosols (LN, green crosses) starts increasing until it becomes the dominating particle type at 50° W, where the total AOD has decreased to 0.3 (from a maximum of 0.75). This apparent change in absorption is mainly due to the fact that we use UVAI as a measure for absorption: as UVAI increases with AOD and aerosol altitude, the gradual descent of the dust layer (Colarco et al., 2003) combined with the decreasing AOD causes UVAI to fall below the upper threshold value of 0.25. This indicates that GACA underestimates dust abundance far from its source.
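An AOD-weighted aerosol fraction of the kind shown in the lower panel of Fig. 5 can be computed as sketched below. This is one plausible reading of "weighted by AOD" (each data point contributes its AOD to the total of its type); the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def aod_weighted_fractions(samples):
    """Fraction of the total AOD carried by each aerosol type.

    samples: iterable of (aerosol_type, aod) pairs, e.g. all monthly
    grid-box values that fall into one longitude bin of a transect.
    """
    totals = defaultdict(float)
    for aerosol_type, aod in samples:
        totals[aerosol_type] += aod
    total_aod = sum(totals.values())
    if total_aod <= 0:
        return {}
    return {t: a / total_aod for t, a in totals.items()}


# Hypothetical bin: two LA points with high AOD, one LN point with low AOD.
fractions = aod_weighted_fractions([("LA", 0.5), ("LA", 0.25), ("LN", 0.25)])
```

Here LA accounts for two thirds of the data points but three quarters of the AOD, which illustrates why weighting by AOD emphasizes the types associated with heavy aerosol loads.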

Source type
The results from a run of GACA-source with data from 2007-2011 are shown in the form of seasonal global maps with 2° × 2° resolution in Fig. 6. The upper frame shows the main source type in winter. Most of the continental Northern Hemisphere aerosols are of urban/industrial origin (URB, dark blue), except where mineral dust (DD, red) predominates (in northern Africa, the southern Arabian Peninsula, and northwestern China). Biomass burning smoke (BB, dark red) can be found in sub-Sahelian Africa in this season, as well as over parts of Southeast Asia. The forested part of South America is a large source of secondary organic particles (BIO, dark green). Aged aerosols (AGED, blue-gray) can be seen in the outflow from Asia (India, China) and are also found in the air masses transported from equatorial Africa over the Atlantic. Most of the aerosols over oceans are classified as sea salt (SS, light blue), although aerosols of undefined composition (XX, dark gray) are found in the Asian outflow over the Pacific and the African outflow over the Atlantic. The band of aerosols at 40-60° S (also seen in March-May) is caused by unrealistically high AOD mainly due to inaccurate wind speed assumptions and residual cloud contamination in the MODIS retrieval (Levy et al., 2013; Schutgens et al., 2013) and may be ignored. In spring and summer (second and third panels of Fig. 6) more dust is activated within the global dust belt. The amount of biomass burning smoke also increases as first the agricultural fires in Southeast Asia reach their springtime peak and then the Southern Hemisphere fire season starts in summer. A conspicuous sulfate (VOG) plume is seen emerging from Hawaii and is mainly due to prodigious degassing in April-October 2008 by the Kilauea Volcano (19.4° N, 155.3° W) (see, e.g., Yuan et al., 2011; Beirle et al., 2014). The misclassification of SS aerosols over continents in the high latitudes is most apparent in fall (lower-most panel). These grid boxes show no enhanced trace gas concentrations and have mean AOD < 0.15, corresponding to the definition of SS in GACA. These aerosols may be regarded as background aerosols of which the source cannot reliably be determined by GACA.
Whereas Fig. 6 depicts the main aerosol source, determined from all data points within a grid box, Fig. 7 shows the aerosol source determined for each of the nine aerosol types separately. The data are from June-August 2007-2011: the same data set as shown in the third panel of Fig. 6. The three absorbing aerosol types (small, medium-size and large) are shown in Fig. 7a-c. Medium-size and large absorbing aerosols north of the Equator are almost exclusively attributed to mineral dust; the apparent band of desert dust at 60° S is caused by a few data points with unrealistically high AOD, as mentioned above, in addition to erroneous (high) UVAI values that are probably caused by small scattering angles (90-100°) encountered in this region. The smoke plume off the southwestern coast of Africa in panel (a) is rather unusual, as biomass burning particles are usually small. This is caused by the use of EAE as a measure for aerosol size; although the EAE is < 0.75, the FMF in this region is on the order of 0.7, indicating a large fraction of small aerosols (not shown). It is unclear why EAE and FMF show opposing behavior in this region. We speculate that it has to do with the persistent low-cloud cover during the biomass burning season, which may cause enhanced cloud contamination and, possibly, a wrong choice of aerosol model. However, as was pointed out in various studies, the size information retrieved by MODIS is not very reliable (e.g., Remer et al., 2005; Levy et al., 2010) and we do not pursue the issue further (but see Sect. 5.2 for a discussion on EAE and FMF). GACA assumes that the source of all small absorbing aerosols is biomass burning; therefore, no other source type is seen to contribute in Fig. 7c.
Neutral aerosol types, shown in Fig. 7d-f, come from various sources: URB, SS, AGED, MIX, BB, and some DD. Large and medium-size non-absorbing aerosols (Fig. 7g-h) are dominated by SS with contributions from URB and MIX. Small non-absorbing (Fig. 7i) is the dominating aerosol type throughout the eastern United States, most of Europe and eastern China (compare lower right panel of Fig. 4), where URB is the main source. Large parts of South America and southern Africa can be seen to emit BIO aerosols (which are assumed to be exclusively small non-absorbing particles), but the AOD-weighted main aerosol source in those regions is BB, in contrast to Southeast Asia, where BIO is the dominant aerosol source in this season (Fig. 6). Performing the analysis by GACA-source on each aerosol type separately allows for an insight into the aerosol mixture that cannot be seen when studying the main source map only.
Additional information on the sources can be gained by adding the trace gas information that was not directly used for source assignment. For example, the URB source is assigned based only on the presence of enhanced NO2 (after exclusion of BIO as source type; see Fig. 2), but the information on other trace gas means is retained. By adding binary coding, i.e., values of 1, 2, and 4 to grid boxes with enhanced mean values of HCHO, SO2, and CO, respectively, we obtain the map presented in Fig. 8 for June-August 2007-2011. If only NO2 is enhanced, the grid box has an index of 0 and appears dark blue. If, in addition, HCHO is enhanced, the grid box obtains an index of 0 + 1 = 1 and appears in a lighter shade of blue. Grid boxes with enhanced NO2, HCHO, and CO are indexed 0 + 1 + 4 = 5 and are shown in orange. Grid boxes with a main source other than URB are shown in light gray. This analysis reveals a great diversity in urban/industrial emissions. A clear separation can be seen in Europe, with enhancements of NO2 in the west but additionally enhanced HCHO in the east, confirming the findings of Veefkind et al. (2011). Further east, urban/industrial aerosols are again only associated with increased NO2 columns. Throughout most of the Indian subcontinent, both NO2 and HCHO are enhanced. The increased HCHO levels are mainly due to human activities: industrial and vehicle exhaust, biogenic emissions from agriculture (direct and indirect, by CH4 oxidation) and burning of biomass (e.g., household fires) (Stavrakou et al., 2009). A similar pattern can be seen over northern Thailand. Northeastern China emits large quantities of all trace gases investigated here, in addition to aerosols. It is one of only a few regions where anthropogenic SO2 can be detected from satellite, another being the Highveld in South Africa, which stands out in light blue (enhanced NO2 and SO2). Grid boxes colored yellow and orange (increased CO, or increased CO and HCHO levels) mostly appear at the edges of regions with intense biomass burning in this season (central Africa, central South America) and are influenced by fire emissions or are possibly misclassified. In southern South America, the South Atlantic Anomaly causes errors in trace gas retrievals that show up mainly in erroneously enhanced SO2 values. Although the HCHO retrieval is similarly affected, the threshold used in GACA-source is high enough to exclude those outliers. Urban aerosols in North America are accompanied by enhanced HCHO levels, which are mostly of biogenic origin (Goldstein et al., 2009; Stavrakou et al., 2009). The filament with enhanced CO and HCHO seen on the northeast coast of the United States is possibly due to transported wildfire smoke from the northwest (Canada/Alaska). Similar patterns are found for the other seasons (see Fig. S1 in the Supplement).
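The binary coding described above (1 for HCHO, 2 for SO2, 4 for CO, added to a base index of 0 for NO2-only URB grid boxes) is a small bit-flag scheme. The following Python sketch reproduces the index values quoted in the text; the function and constant names are hypothetical, and the boolean inputs are assumed to come from the threshold tests on the mean columns.

```python
# Bit values as given in the text; NO2 enhancement is implied by the URB assignment.
HCHO_BIT, SO2_BIT, CO_BIT = 1, 2, 4


def urb_trace_gas_index(hcho_enhanced, so2_enhanced, co_enhanced):
    """Index 0-7 describing which additional trace gases are enhanced."""
    index = 0
    if hcho_enhanced:
        index += HCHO_BIT
    if so2_enhanced:
        index += SO2_BIT
    if co_enhanced:
        index += CO_BIT
    return index


def decode_index(index):
    """List the additionally enhanced trace gases encoded in an index."""
    flags = (("HCHO", HCHO_BIT), ("SO2", SO2_BIT), ("CO", CO_BIT))
    return [name for name, bit in flags if index & bit]
```

For instance, `urb_trace_gas_index(True, False, True)` gives 5 (NO2, HCHO, and CO enhanced), and a grid box with enhanced NO2 and SO2 only, as described for the Highveld, gives 2.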

Comparison with MACC
Main aerosol types from the MACC reanalysis for 2007-2011 were grouped by season and treated analogously to the measured data with respect to the minimum AOD threshold of 0.05 and the removal of outliers. The dominating aerosol component is shown in Fig. 9 in a similar fashion to Fig. 6, but there are two important differences.
1. The aerosol components are different (Benedetti et al., 2009): black carbon (BC, black) originates mostly from biomass burning but also occurs in urban regions due to, e.g., vehicle exhaust or household fires. In Fig. 9, AOD due to BC is additionally shown in contours (AOD = 0.02-0.1), to indicate the regions affected by biomass burning, as BC constitutes only a small fraction of aerosol emissions by fires. The contribution of organic matter (OM, green) to biomass burning smoke is much greater, but OM also has important biogenic and anthropogenic sources. The types desert dust (DD) and sea salt (SS) are equivalent to the source types of the same name in GACA and are therefore indicated with the same colors (red and light blue, respectively). Sulfate aerosols (SO4) are indicated in the same color, blue, as URB aerosols in GACA, because the sources are assumed to be similar. The aerosol type is set to MIX when none of the aerosol components contributes more than 50 % to the total AOD.
2. MACC data are also inherently different from GACA data in that each data point contains contributions of each aerosol component (BC, OM, DD, SO4, SS), whereas GACA determines only one dominating aerosol source per grid box or, at most, one dominating aerosol source for each aerosol type found in a grid box.
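The rule used to reduce the five MACC components to a single dominating component per grid box (MIX unless one component exceeds 50 % of the total AOD) can be written down compactly. The sketch below is illustrative; the function name and the sample AOD values are hypothetical.

```python
def dominant_component(aod_by_component, majority=0.5):
    """Return the component contributing more than `majority` of the total AOD.

    aod_by_component: dict such as {"BC": ..., "OM": ..., "DD": ..., "SO4": ..., "SS": ...}.
    Returns "MIX" if no component exceeds the majority fraction, and None
    if the total AOD is not positive.
    """
    total = sum(aod_by_component.values())
    if total <= 0:
        return None
    name, aod = max(aod_by_component.items(), key=lambda item: item[1])
    return name if aod / total > majority else "MIX"


# Hypothetical grid boxes.
dusty = {"BC": 0.02, "OM": 0.10, "DD": 0.50, "SO4": 0.10, "SS": 0.05}
mixed = {"BC": 0.10, "OM": 0.20, "DD": 0.20, "SO4": 0.30, "SS": 0.20}
```

With these values `dominant_component(dusty)` is "DD", whereas `dominant_component(mixed)` is "MIX" because the largest component (SO4) carries only 30 % of the total AOD.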
At first glance, the agreement between GACA-source and MACC is quite good (compare Figs. 6 and 9): the general spatial and seasonal patterns of DD and SO4 (or URB) agree well. The biomass burning regions roughly agree, although the model does not show BC in South America in summer, where GACA sees a lot of BB (mostly due to fires in August, as can be seen in MODIS fire count patterns). In addition, GACA selects BB as the main source of AOD in sub-Sahelian Africa in the first half of the year, whereas in MACC DD dominates. On the other hand, the agricultural fires in Southeast Asia in spring are well captured by both GACA and MACC. The main sources of BIO (or OM) agree in GACA and MACC, but the source in the southeastern USA is missed by the model. The differences between MACC and GACA will be discussed in more detail in the following section, where regional seasonal cycles are investigated.

Regional seasonal cycles
Six 5° × 5° regions were selected for the study of the seasonal cycle: central South America (1), southern Africa (2), southeastern USA (3), northwestern Europe (4), Thailand (5), and northeastern China (6); the regions are shown as enumerated yellow boxes in the third panel of Fig. 6. For each season of each year (2007-2011), the AOD of every aerosol type is shown in panels (a1)-(a6) of Figs. 10-12. The dominant aerosol source was determined for each individual aerosol type separately and is shown in panels (b1)-(b6). The AOD fractions are therefore equal in panels (a) and (b) of Figs. 10-12. For example, the bar representing the fall (September-November) of 2007 in Fig. 10a1 contains contributions from MA, SA, SN, and SNA types. The AOD fraction corresponding to SNA reappears in Fig. 10b1 in dark green (BIO), the dominant source of the SN fraction is URB (blue), and the summed AOD from SA and MA types is attributed to BB (brown). Panels (c1)-(c6) of Figs. 10-12, finally, display the AOD corresponding to the MACC aerosol types for the same regions. All data presented in Figs. 10-12 can be found in Tables S1-S6 in the Supplement.
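The relation between panels (a) and (b) in the fall 2007 example can be made explicit: the AOD of each aerosol type is carried over to the source assigned to that type, and types sharing a source are summed. In the Python sketch below, the type-to-source mapping follows the example in the text, while the AOD values and the function name are hypothetical.

```python
def aod_by_source(aod_by_type, source_of_type):
    """Sum the AOD of all aerosol types that were assigned the same source."""
    totals = {}
    for aerosol_type, aod in aod_by_type.items():
        source = source_of_type[aerosol_type]
        totals[source] = totals.get(source, 0.0) + aod
    return totals


# Type-to-source assignment as described for fall 2007 in Fig. 10a1/b1.
source_of_type = {"MA": "BB", "SA": "BB", "SN": "URB", "SNA": "BIO"}
# Hypothetical AOD per aerosol type (not the values of Table S1).
aod_by_type = {"MA": 0.25, "SA": 0.5, "SN": 0.125, "SNA": 0.125}

sources = aod_by_source(aod_by_type, source_of_type)
```

The total AOD is the same in both representations; only the grouping changes, which is why the AOD fractions are equal in panels (a) and (b).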
The first two regions, central South America and southern Africa (panels a1-c1 and a2-c2 of Fig. 10, respectively), are characterized by seasonal biomass burning. The fire season starts in late summer in South America; the highest number of fires is usually found in fall. The high year-to-year variability of biomass burning in this region is clearly reflected in all three panels. Both GACA and MACC ascribe the larger part of AOD in winter and spring to secondary organic aerosols (BIO and OM in GACA and MACC, respectively). Although the DD contribution in the model appears to be somewhat high (no DD is detected by GACA), the agreement between GACA and MACC is good for this example. Good agreement is also found for southern Africa, where smoke forms the major part of the aerosol mixture during the fire season in summer, when the highest AOD values are detected. All panels show that the year-to-year variation is much smaller than in South America. Urban/industrial aerosols appear to be overestimated by GACA, whereas MACC shows higher contributions of DD.
The regions southeastern USA and northwestern Europe are dominated by non-absorbing aerosols (Fig. 11a3, a4). Throughout most of the year, aerosols over the southeastern USA are of urban/industrial origin (URB and SO4 for GACA and MACC, respectively). In summer this region is dominated by secondary organic aerosols (Goldstein et al., 2009), clearly seen by GACA (Fig. 11b3), which attributes nearly all AOD to BIO. MACC, on the other hand, only shows a slight increase in OM relative to the other seasons. The contributions of dust and sea salt to the aerosol mixture appear to be too large in the model in comparison to GACA results, which points to sources missing in the model: MACC scales the aerosol amount with MODIS AOD but keeps the mass fractions of the different aerosol components constant (see Sect. 2.2). Hence, if a source is missing, e.g., secondary organic aerosols, the AOD due to those aerosols is spread over the remaining components. The small year-to-year variation observed in MACC aerosol composition is a result of this procedure.
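The consequence of scaling the total aerosol amount while keeping the component fractions fixed can be illustrated with a toy calculation. The sketch below is a deliberate simplification of the statement in the text and not the MACC data assimilation scheme; the function name and all numbers are hypothetical, and AOD fractions stand in for the mass fractions mentioned above.

```python
def scale_to_observed_aod(model_aod_by_component, observed_total_aod):
    """Rescale all components by one common factor to match the observed total."""
    model_total = sum(model_aod_by_component.values())
    if model_total <= 0:
        raise ValueError("model total AOD must be positive")
    factor = observed_total_aod / model_total
    return {name: aod * factor for name, aod in model_aod_by_component.items()}


# Hypothetical model state in which the organic source (OM) is missing.
model = {"SO4": 0.25, "DD": 0.125, "SS": 0.125, "OM": 0.0}
scaled = scale_to_observed_aod(model, observed_total_aod=1.0)
```

After scaling, the total matches the observation, but OM is still zero and the extra AOD has been distributed over sulfate, dust, and sea salt in their original proportions.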
There is no clear aerosol seasonal cycle recognizable in northwestern Europe (Fig. 11a4-c4): the AOD is rather constant throughout the year and the composition rarely deviates from the urban/industrial (URB and SO4) type. In winter there is a larger contribution of medium-size and large particles (Fig. 11a4), which GACA-source has trouble identifying but which MACC attributes to sea salt. As in all previous regions, the model sees significant amounts of dust that are not detected by GACA. This can partly be explained by too low deposition rates in the model but may also be due to the fact that GACA does not select DD as a source if any trace gas means are enhanced (unless the aerosol type is large absorbing).
Figure 12 presents the seasonal cycle for two regions in Asia. In winter and particularly in spring, agricultural fires in Thailand release large quantities of smoke, as seen by both GACA and MACC (Fig. 12a5-c5). During the rainy season (June-October) secondary aerosols dominate, both from anthropogenic (URB and SO4) and biogenic sources (BIO and OM). MACC finds significant contributions of dust which are not seen by GACA.
In northeastern China, the seasonal mean AOD is greater than 0.5 throughout the year for each year from 2007 to 2011 (Fig. 12a6-c6). Most of the AOD can be attributed to aerosols of anthropogenic origin (URB and SO4), but a large fraction is caused by mineral dust transported from deserts in Mongolia, northern China, and Kazakhstan, especially in winter and spring. In view of their sizes (medium to large), most of the aerosols characterized as BB by GACA are probably polluted dust or dust in the presence of pollution, i.e., NO2, HCHO, SO2 or CO. The variability of the seasonal cycle of DD appears to be underestimated by MACC (compare Fig. 12a6 and c6). The amount of modeled BC in China is as high as for South America in the biomass burning season (see Fig. 10c1), which may be reflected by the high levels of aerosol absorption found by GACA for northeastern China. The more probable source of absorbing aerosols is, however, desert dust.

Discussion
GACA is a threshold-based algorithm for the determination of dominant aerosol types and sources globally on a seasonal basis. In this section we investigate the robustness of the algorithm, motivate our choice of EAE (as opposed to FMF), and compare results from GACA with previously reported climatologies from measurements and models. Although the algorithm can be improved further by fine-tuning with regional settings and/or additional (satellite) data, the main objective of the current study is to explore what can be learned from the combination of different satellite data sets. We present some suggestions for future improvements to GACA in Sect. 5.4.

Sensitivity studies
It is clear that GACA results depend on the choice of thresholds and criteria for aerosol type and source determination. Most source assignments are rather robust, and altering thresholds only causes small shifts of borders between different sources. Beyond being rooted in textbook knowledge, our criteria are justified by the consistency of the obtained results and the good general agreement with MACC model results. The basic assumption underlying GACA is that enhancements in trace gas and aerosol abundance are caused by the same source; wherever this is not the case, the algorithm fails. Correctly characterizing mixed air masses (e.g., dust with smoke or pollution) or transported aerosols (that may be present above or in addition to local pollution) is thus beyond the capabilities of GACA.
To investigate how robust GACA is with respect to effects of clouds, varying time ranges, and the treatment of outliers, we performed a series of tests. First, we applied different cloud filters to the GOME-2 data prior to gridding. Unfortunately, a similar test could not be performed on MOPITT data, as we used gridded monthly means that had already been cloud-cleared. MODIS AOD is only retrieved under clear sky conditions, but because the field of view of the instrument is small, retrievals in between cloud patches are often possible in regions that would be considered cloudy by GOME-2. Setting the maximum effective cloud fraction (CF) to 0.05, 0.20, or 0.40 does not cause major changes in the global maps. Perhaps surprisingly, the results are still similar if only data with CF > 0.40 are selected: the main difference is the disappearance of non-absorbing aerosol types due to the increase in data points with UVAI < 0. We conclude that measurements of NO2, HCHO, SO2, and UVAI in the presence of clouds contain enough information to be used for characterization of aerosol (or air mass) sources, at least on a monthly mean basis. Measurements of other trace gases, e.g., CO, are expected to be similarly useful (e.g., Liu et al., 2014).
The effects of varying the time range from the maximum of 15 months per season (5 years × 3 months) are rather trivial: the scatter increases with decreasing data amount, and so does the influence of one-time events, such as volcanic eruptions. We performed tests for the summer (June-August) and found that GACA-type and GACA-source results are very similar if data from 2007-2011 or 2008-2010 are used. Decreasing the time window further to July 2007-2011 (5 months) or to June-August 2009 causes noisy results with large data gaps (particularly over South America). For source determination of individual aerosol types (as in Fig. 8), the statistical requirements are even higher. Changing the resolution of GACA-source to 1° × 1° yields dominant source maps very similar to those in Fig. 6 but with several large data gaps, most notably over South America in summer.
In the standard GACA setup, each data set is screened for outliers which are then removed (see Sect. 3.1 for details). The reason for this procedure is that GACA is aimed at constructing a climatology in which exceptional events (large fires, volcanic eruptions, etc.) should not be represented. Another reason is the removal of artifacts which are, however, only rarely encountered in the monthly averaged, gridded data sets used here, except in the region affected by the South Atlantic Anomaly. If GACA is run without removing outliers, the resulting source maps are very similar to those from the standard run (compare Fig. 6 with Fig. S4 in the Supplement); in fact, the map for winter does not change at all. The biggest change is found for the spring maps, where several volcanic sulfate (VOG) plumes appear, most prominently the one from the Fernandina Volcano on the Galapagos Islands, which erupted in April 2009. VOG plumes from degassing (Kilauea, Hawaii, 2008) and erupting (Nabro, Eritrea, 2011) volcanoes are also seen more clearly in the summer map when outliers are not removed. The largest change in summer is caused by the exceptional fire season that occurred in 2010 in Russia. Because GACA uses AOD weighting, the thick, persistent smoke plumes strongly influence the algorithm, despite the fact that the fires occurred in only 2 out of 15 months considered. In South America more grid boxes are assigned to BB, replacing URB; the same is seen in fall, although there BB replaces several assignments of BIO if outliers are included in the analysis.

Extinction Ångström exponent and fine-mode fraction
Throughout this study, EAE is used as a measure of aerosol size, instead of the often-used FMF (also denoted as η in the MODIS literature). The main reason is consistency among the three MODIS aerosol algorithms: the Deep Blue algorithm does not output FMF, and although both dark target algorithms (land and ocean) provide values of FMF, the definitions are different. The MODIS over-ocean retrieval adjusts the abundance of two aerosol types (one fine-mode, one coarse-mode) to best fit the measured radiance at six wavelength bands. The two types are chosen from a total of nine aerosol types (four fine, five coarse), each represented by a single lognormal size distribution. The over-ocean FMF is the radiance fraction attributed to the fine-mode aerosol type (Remer et al., 2005). Over (dark) land, the FMF represents the weighting of fine-dominated and coarse-dominated models, which each consist of fine and coarse mode(s). In practice, FMF is essentially binary, rarely deviating from either 0 or 1 (Levy et al., 2010).
The definition of EAE, on the other hand, is unambiguous (Eq. 1). Throughout the course of the MODIS retrieval, AOD is determined at each of the wavelengths used in the retrieval; hence, EAE can be computed for several wavelength combinations. Here, 470 and 660 nm were chosen, as these are the only two wavelengths used in all three MODIS retrievals. Despite the fact that, like FMF, EAE is affected by a priori assumptions of aerosol optical properties and surface reflectance (over land), the monthly pattern of EAE corresponds to the global distribution of dust and non-dust (Remer et al., 2005), and this is sufficient for the application presented here. For spatially and temporally higher-resolved characterization studies, however, a different (or additional) metric may need to be used, e.g., size and/or shape from instruments like the MISR (Kahn et al., 2005) or POLDER (Polarization and Directionality of the Earth's Reflectances; Tanré et al., 2011).
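For reference, the two-wavelength Ångström exponent can be computed as sketched below. The sketch assumes that Eq. (1), which is not reproduced in this section, is the standard definition, i.e., the negative ratio of the logarithm of the AOD ratio to the logarithm of the wavelength ratio; the function and parameter names are hypothetical.

```python
from math import log


def extinction_angstrom_exponent(aod_1, aod_2,
                                 wavelength_1_nm=470.0, wavelength_2_nm=660.0):
    """Standard two-wavelength Angstrom exponent.

    aod_1 and aod_2 are the AOD values at wavelength_1_nm and wavelength_2_nm;
    the defaults correspond to the 470 and 660 nm pair named in the text.
    """
    if aod_1 <= 0 or aod_2 <= 0:
        raise ValueError("AOD values must be positive")
    return -log(aod_1 / aod_2) / log(wavelength_1_nm / wavelength_2_nm)
```

A spectrally flat AOD (equal values at both wavelengths) gives an exponent of 0, typical of large particles, whereas an AOD that decreases strongly with wavelength gives values above 1, typical of small particles.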

Comparison with other climatologies
Different aerosol climatologies of microphysical aerosol properties (or proxies) have been constructed using remotely sensed data in the past. The most established empirical climatologies are derived from AERONET (Aerosol Robotic Network) data (Dubovik et al., 2002; Omar et al., 2005; Levy et al., 2007a; Lee et al., 2010). At first glance, the agreement between GACA-source and AERONET-derived climatologies (e.g., Fig. 2 in Omar et al., 2005, Fig. 3 in Levy et al., 2007a, or Fig. 2 in Lee et al., 2010) is good. However, due to large differences in spatial sampling and the limited information available from AERONET, the informative value of such a comparison is limited. More recently, large-scale collaborations between various modeling groups have shown that a combination (or mean) of aerosol properties from different models performs better (i.e., displays smaller differences with measurements) than the output of any single model (e.g., Kinne et al., 2013; Sessions et al., 2015). The resulting climatologies (Fig. 2 in Kinne et al., 2013, and Fig. 3 in Sessions et al., 2015) are in agreement with GACA regarding the dominating aerosol type. But, again, the gain from such a comparison is limited because there is no separation of aerosol types in the presented model climatologies apart from that between fine and coarse modes. It would be more interesting to compare the aerosol composition from the model climatologies with GACA-source, but this is beyond the scope of the current study. Recently published model data of global aerosol composition (Chin et al., 2014) allow for a more detailed comparison with GACA-source results. The agreement between our Figs. 10-12 and Chin's Fig. 6a (where regional annual average AOD composition from 1980-2009 is shown) is good; many of the discrepancies between GACA-source and GOCART (Goddard Chemistry Aerosol Radiation and Transport) model results may be attributed to the differences in geographical selection. There are, however, some important differences, two of which point to inaccuracies in the modeling of secondary organic aerosols. In the regions of southern USA and South America, GOCART clearly underestimates the amount of organic matter contributing to aerosols. This is particularly evident in South America, where both GACA-source and MACC ascribe the major part of AOD to secondary organic aerosols throughout the year, whereas in GOCART sulfate aerosols contribute almost 50 % to the yearly mean AOD. Additionally, the amount of desert dust appears to be high compared to GACA. The general underestimation of secondary organic and biomass burning aerosols, as well as the overestimation of desert dust by the GOCART model, is known (Chin et al., 2014) and might be remedied with the help of an algorithm like GACA.

Applications and improvements
The presented algorithm is an attempt at determining dominating aerosol types and sources on a global scale and mainly intends to show the potential of combined trace gas and aerosol data sets. The most important application of an algorithm like GACA is the improvement of model emissions of aerosols and trace gases, as suggested in the study by Xu et al. (2013). Not only models that rely on data assimilation (like MACC, now succeeded by CAMS) may benefit from comparisons with GACA. The possibility of selecting certain aerosol types (e.g., small non-absorbing aerosols) or sources (e.g., urban/industrial) for more detailed investigations of the relationships between AOD and trace gases is a useful tool for the assessment of model performance regarding aerosols and may assist in finding strategies to improve aerosol parameterization. In addition, GACA is rather robust despite the flexibility with respect to temporal and spatial resolution and input data.
There is a multitude of possible adaptations for an algorithm like GACA, but here we focus on three.
1. Adaptation of GACA to shorter time periods and smaller spatial scales. The algorithm as such can be easily applied to daily Level-2 data (on a single-pixel scale), with the caveat that co-location of the measurements then becomes more important. This could be achieved using data from a single instrument (e.g., GOME-2 or OMI), from different instruments on the same platform (GOME-2 and Infrared Atmospheric Sounding Interferometer (IASI); OMI and Tropospheric Emission Spectrometer, TES), or from instruments closely following each other, as in the A-Train. Such an approach could be directly applied to atmospheric composition modeling through global data assimilation, e.g., in CAMS. Using the combined information from different satellite observations, the aerosol type could be updated in addition to the total AOD, yielding a more realistic mix of aerosol composition.
2. Application of GACA to cloudy data, i.e., aerosol and trace gas measurements of pixels with high cloud cover. As shown above, trace gas measurements of cloudy pixels contain enough information to be used for aerosol characterization. These would have to be combined with aerosol retrievals over clouds, e.g., from MODIS or OMI (Torres et al., 2012; Jethva et al., 2013, 2014).
3. Adaptation of GACA to ground-based data. For example, multi-axis DOAS (MAX-DOAS) measurements of trace gases could be combined with aerosol data from a sun photometer (e.g., AERONET) to assess local aerosol sources.
Possible future improvements include (a) the use of more aerosol data, e.g., particle shape and aerosol layer height (e.g., from POLDER or MISR) or more trace gas data from GOME-2 (glyoxal) or other instruments; (b) making use of spatial and/or temporal patterns and correlations, e.g., by taking into account the results from neighboring grid boxes or by pattern recognition; and (c) replacing the fixed thresholds with a threshold climatology that depends on location and season.

Conclusions
Aerosols and trace gases are frequently co-located, and often even correlated, because they are (1) emitted by the same sources, e.g., in the case of biomass burning smoke; (2) formed from the same precursor, e.g., volatile organic compounds and secondary organic aerosols; or (3) formed from trace gases in the atmosphere, e.g., sulfate aerosols from SO2. We exploit this fact for the assessment of the dominant aerosol source from satellite observations. In this paper, we introduce a strategy for the systematic classification of aerosols using the combination of aerosol optical depth and extinction Ångström exponent from MODIS with UV Aerosol Index and trace gas columns (NO2, HCHO, and SO2) from GOME-2, and CO columns from MOPITT. Our Global Aerosol Classification Algorithm, GACA, is separated into two main steps: first, an aerosol type is determined based on its optical properties; subsequently, trace gas information is added to appoint a dominant aerosol source.
The obtained global yearly and seasonal maps are generally in good agreement with MACC model data, which lends credibility to both. However, systematic differences are also found: more desert dust and less secondary organic aerosols are indicated by MACC than by GACA. This demonstrates the potential of our method, which combines aerosol and trace gas data, to evaluate and investigate aerosol treatment (parameterization, sources, transport, aging and removal processes) in air quality and climate models. One possible application of an algorithm like GACA is the updating of both aerosol and trace gas emissions, e.g., in CAMS (successor of MACC) or in GEOS-Chem, as suggested in the study by Xu et al. (2013). Since the mix of aerosol types is currently preserved in models, a combined data assimilation of aerosol and trace gas observations would lead to an overall more realistic representation of aerosols by models. We find that the rather simple, threshold-based GACA yields plausible results that are robust with respect to outliers, the choice of time range, and cloud fraction thresholds. We emphasize, however, that the presented study is exploratory in nature. We provide several suggestions for improvement of the algorithm. With the coming new generation of space-based DOAS instruments with high spatial resolution, in particular TROPOMI (Tropospheric Monitoring Instrument on the polar-orbiting Sentinel-5p platform; Veefkind et al., 2012) and the geostationary Sentinel-4 (Ingmann et al., 2012), more (cloud-free) data will be available. With such instruments, global aerosol-type maps with even higher spatial and temporal resolution become feasible. These maps may find a wide range of applications: modelers can use the information to verify emissions and aerosol processes, scientists can use it to update the aerosol climatologies used in the retrieval of aerosol optical depth (e.g., MODIS) or trace gas columns, and environmental policy makers can use it to develop effective mitigation strategies.
The Supplement related to this article is available online at doi:10.5194/acp-15-10597-2015-supplement.

Figure 1. Speciation of aerosol types based on absorption (UVAI) and size (EAE). Left, aerosol types color-coded according to size (larger sizes have darker hues) and absorption (non-absorbing in blue, neutral in green, absorbing in red): LA, large absorbing; MA, medium-size absorbing; SA, small absorbing; LN, large, neutral; MN, medium-size, neutral; SN, small, neutral; LNA, large, non-absorbing; MNA, medium-size, non-absorbing; SNA, small, non-absorbing. Right, monthly mean UVAI and EAE within grid boxes in regions dominated by desert dust (red dots), biomass burning smoke (gray crosses), secondary biogenic aerosols (green circles), and sea salt (light blue pluses). Data are from June-August 2007-2011; see the text for the selected geographical regions.
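
The 3 × 3 grid of aerosol types named in the Figure 1 caption can be illustrated with a minimal Python sketch. The threshold values and all function names below are hypothetical placeholders chosen for illustration; the values GACA actually uses are those listed in Table 3 and are not reproduced here. The sketch relies only on two general facts: a larger extinction Ångström exponent indicates smaller particles, and a positive UV Aerosol Index indicates absorbing aerosol.

```python
# Hypothetical thresholds for illustration only (not the values from Table 3).
EAE_LARGE_MAX = 0.8     # assumed: EAE below this means large particles
EAE_SMALL_MIN = 1.4     # assumed: EAE above this means small particles
UVAI_NONABS_MAX = -0.2  # assumed: UVAI below this means non-absorbing
UVAI_ABS_MIN = 0.5      # assumed: UVAI above this means absorbing


def size_class(eae):
    """Return 'L', 'M', or 'S' from the extinction Angstrom exponent."""
    if eae < EAE_LARGE_MAX:
        return "L"
    if eae > EAE_SMALL_MIN:
        return "S"
    return "M"


def absorption_class(uvai):
    """Return 'A' (absorbing), 'N' (neutral), or 'NA' (non-absorbing)."""
    if uvai > UVAI_ABS_MIN:
        return "A"
    if uvai < UVAI_NONABS_MAX:
        return "NA"
    return "N"


def aerosol_type(uvai, eae):
    """Combine size and absorption into one of the nine type labels."""
    return size_class(eae) + absorption_class(uvai)


if __name__ == "__main__":
    print(aerosol_type(uvai=1.5, eae=0.3))   # large, absorbing
    print(aerosol_type(uvai=-0.5, eae=1.8))  # small, non-absorbing
```

With these assumed thresholds, a dust-like box (high UVAI, low EAE) falls into LA and a fine, scattering aerosol falls into SNA; changing the placeholder thresholds shifts the class boundaries but not the labeling scheme.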

Figure 2. Schematic decision tree of GACA. The corresponding threshold values are given in Table 3. The mean value of a quantity, e.g., CO, is denoted $\overline{\mathrm{CO}}$; the coefficient of correlation between AOD and a quantity, e.g., HCHO, is denoted $R^2(\mathrm{HCHO})$. Thresholds are denoted as, e.g., $\mathrm{SO}_{2,\mathrm{thresh}}$, $R^2_{\mathrm{thresh}}$, $\mathrm{ratio}_{\mathrm{thresh}}$ (for the HCHO : NO₂ ratio threshold), or $\mathrm{AOD}_{\mathrm{SS\text{-}thresh}}$ (for the maximum AOD allowed for SS classification). Other abbreviations are explained in Table 2.
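
The two quantities that the decision tree compares against thresholds, the mean trace gas column and its squared correlation with AOD, can be computed as in the following Python sketch. The function names are hypothetical, and the rule that combines the two tests (a simple logical AND) is an assumption made for illustration; the actual branching is the one drawn in Figure 2, and the actual thresholds are those in Table 3.

```python
from math import fsum


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return fsum(values) / len(values)


def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = fsum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = fsum((a - mx) ** 2 for a in x)
    syy = fsum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy * sxy / (sxx * syy)


def gas_indicates_source(aod, vcd, vcd_thresh, r2_thresh):
    """Assumed rule: the mean column is enhanced AND it correlates with AOD."""
    return mean(vcd) > vcd_thresh and r_squared(aod, vcd) > r2_thresh


if __name__ == "__main__":
    # Made-up monthly means for one grid box (AOD unitless, VCD in molec cm^-2).
    aod = [0.1, 0.2, 0.3, 0.4]
    hcho = [1e15, 2e15, 3e15, 4e15]
    print(gas_indicates_source(aod, hcho, vcd_thresh=2e15, r2_thresh=0.5))
```

The input numbers are invented solely to exercise the functions and do not represent measured columns.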

Figure 3. Relationship between 1° × 1° monthly mean values of AOD and trace gas columns (in molec cm⁻²) for a region in central Africa (2-4° S, 18-20° E; left panel) and in the eastern Pacific Ocean (16-18° N, 162-164° W; right panel) for July-August 2007-2011. Dots depict NO₂ (blue), HCHO (green), and SO₂ (red) VCDs and excess CO VCDs (light blue, scaled by a factor of 0.01), shown together with their respective thresholds. The threshold values of NO₂ and SO₂ are identical (dotted blue and red lines). Note the differences in y axis scales.
we demonstrate the algorithm for two 2° × 2° grid boxes: the first shows data from a region in central Africa (2-4° S, 18-20° E) during the biomass burning season, whereas the second is located west of Hawaii, in a region of volcanic outflow at 16-18° N, 162-164° W. The trace gas columns for
[Figure 4 panel titles: Winter (Dec.-Feb.), Spring (Mar.-May), Summer (Jun.-Aug.), Fall (Sep.-Nov.)]

Figure 4. Seasonal cycle of global aerosol type distribution according to GACA. Data are from 2007-2011 and were divided into the four main seasons (from top to bottom): winter, spring, summer, and fall. The legend is given at the bottom; see Fig. 1 and Table 2 for aerosol-type abbreviations. The yellow box indicates the region investigated in Fig. 5.

Figure 5. Transect showing transport of mineral dust plumes. Shown are summertime (June-August 2007-2011) data from 15-20° N, a region of Saharan dust outflow. Upper panel: mean AOD (total of all aerosol types); the mean wind direction is indicated by an arrow, and the surface type (land or ocean) is given at the bottom of the panel. Lower panel: AOD-weighted fraction of all aerosol types contributing > 20 % to AOD.
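
One plausible reading of the "AOD-weighted fraction" in the lower panel of Figure 5 is sketched below in Python: each classified sample contributes its AOD to the total of its aerosol type, and only types whose share exceeds 20 % are kept. This interpretation, the function name, and the input layout are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def aod_weighted_fractions(samples, min_fraction=0.2):
    """Return {type: fraction of total AOD} for types above min_fraction.

    samples: iterable of (aerosol_type, aod) pairs, e.g. one pair per
    monthly mean within a longitude bin (assumed layout).
    """
    totals = defaultdict(float)
    for aerosol_type, aod in samples:
        totals[aerosol_type] += aod
    total_aod = sum(totals.values())
    if total_aod <= 0:
        return {}
    return {
        t: s / total_aod
        for t, s in totals.items()
        if s / total_aod > min_fraction
    }


if __name__ == "__main__":
    # Made-up samples for one longitude bin.
    samples = [("LA", 0.6), ("LA", 0.4), ("MN", 0.5), ("SNA", 0.1)]
    print(aod_weighted_fractions(samples))
```

In this invented example, LA and MN exceed the 20 % cut while SNA is dropped, which mirrors how minor contributors would be omitted from such a panel.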

Figure 6. Seasonal cycle of global main aerosol source distribution according to GACA. Data are from 2007-2011 and were divided into the four main seasons (from top to bottom): winter, spring, summer, and fall. Aerosol source type abbreviations are given in Table 2; gray areas are not analyzed due to lack of data or too small mean AOD (see text for details). Enumerated yellow boxes in the third panel mark the regions investigated in Figs. 10-12, respectively.

Figure 7. Global aerosol source for each aerosol type according to GACA for June-August 2007-2011. Aerosol source and type abbreviations are given in Table 2; gray areas do not contain more than four points belonging to the relevant aerosol type.

Figure 8. Trace gas composition for grid boxes with URB source for June-August 2007-2011. The presence of enhanced trace gas columns (in addition to NO₂) is indicated by 1, 2, or 4 for HCHO, SO₂, and CO, respectively: 1 thus indicates enhanced NO₂ and HCHO, 2 enhanced NO₂ and SO₂, 3 enhanced NO₂ and HCHO and SO₂, etc. Gray areas are not dominated by URB.
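
The numbering in the Figure 8 caption is a sum of per-gas flags (1 for HCHO, 2 for SO₂, 4 for CO), so every combination of enhanced gases maps to a unique value from 0 to 7. A minimal Python sketch of this encoding and its inverse follows; the function names are hypothetical and serve only to illustrate the scheme.

```python
# Flag values as stated in the caption; NO2 is always enhanced for URB boxes.
HCHO_FLAG = 1
SO2_FLAG = 2
CO_FLAG = 4


def urb_composition_code(hcho_enhanced, so2_enhanced, co_enhanced):
    """Encode which gases (besides NO2) are enhanced as a number 0..7."""
    code = 0
    if hcho_enhanced:
        code |= HCHO_FLAG
    if so2_enhanced:
        code |= SO2_FLAG
    if co_enhanced:
        code |= CO_FLAG
    return code


def decode_composition(code):
    """Return the list of gases (besides NO2) flagged in a code."""
    flags = (("HCHO", HCHO_FLAG), ("SO2", SO2_FLAG), ("CO", CO_FLAG))
    return [name for name, bit in flags if code & bit]


if __name__ == "__main__":
    print(urb_composition_code(True, True, False))  # 3, as in the caption
    print(decode_composition(5))                    # ['HCHO', 'CO']
```

Because the flags are distinct powers of two, a map value can be read back unambiguously: for instance, 6 means enhanced SO₂ and CO without enhanced HCHO.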

Figure 9. Seasonal cycle of global main aerosol type distribution according to MACC. Data are from 2007-2011 and were divided into the four main seasons (from top to bottom): winter, spring, summer, and fall. Aerosol types are black carbon (BC), mineral dust (DD), organic matter (OM), sulfate (SO₄), sea salt (SS), and mixture (MIX). Light gray areas (na) are not analyzed because the mean AOD is too small. As BC does not dominate anywhere, contours show mean BC amount (AOD 0.02-0.1) to indicate regions affected by smoke; see text for details.
the Equator at 13:30 local time (LT) and performs its daylight measurements on the ascending part of its orbit; Terra is in a daytime-descending orbit and has a local Equator crossing time of about 10:30 LT.

Table 1. Data sets that are used as input to GACA, with appropriate literature references and websites.

Table 2. Abbreviations of aerosol types and sources used throughout this document.

Table 3. Thresholds used in GACA. Variables are unitless except for the trace gas (excess) VCDs (given in molec cm⁻²).