Atmospheric Chemistry and Physics An interactive open-access journal of the European Geosciences Union
Atmos. Chem. Phys., 17, 3165-3197, 2017
http://www.atmos-chem-phys.net/17/3165/2017/
doi:10.5194/acp-17-3165-2017
© Author(s) 2017. This work is distributed
under the Creative Commons Attribution 3.0 License.
Research article
01 Mar 2017
Resolving anthropogenic aerosol pollution types – deconvolution and exploratory classification of pollution events
Mikko Äijälä1, Liine Heikkinen1, Roman Fröhlich2, Francesco Canonaco2, André S. H. Prévôt2, Heikki Junninen1, Tuukka Petäjä1, Markku Kulmala1, Douglas Worsnop1,3, and Mikael Ehn1
1Department of Physics, University of Helsinki, Helsinki, Finland
2Laboratory of Atmospheric Chemistry, Paul Scherrer Institute, Villigen, Switzerland
3Aerodyne Research Inc., Billerica, MA, USA
Abstract. Mass spectrometric measurements commonly yield data on hundreds of variables over thousands of points in time. Refining and synthesizing this raw data into chemical information necessitates the use of advanced, statistics-based data analysis techniques. In the field of analytical aerosol chemistry, statistical dimensionality reduction methods have become widespread in the last decade, yet comparable advanced chemometric techniques for data classification and identification remain marginal. Here we present an example of combining data dimensionality reduction (factorization) with exploratory classification (clustering), and show that the results can not only reproduce and corroborate earlier findings, but also complement and broaden our current perspectives on aerosol chemical classification. We find that applying positive matrix factorization to extract spectral characteristics of the organic component of air pollution plumes, together with an unsupervised clustering algorithm, k-means++, for classification, reproduces classical organic aerosol speciation schemes. Applying appropriately chosen metrics for spectral dissimilarity along with optimized data weighting, the source-specific pollution characteristics can be statistically resolved even for spectrally very similar aerosol types, such as different combustion-related anthropogenic aerosol species and atmospheric aerosols with a similar degree of oxidation. In addition to the typical oxidation level and source-driven aerosol classification, we were also able to classify and characterize outlier groups that would likely be disregarded in a more conventional analysis. Evaluating solution quality for the classification also provides a means to assess the performance of mass spectral similarity metrics and optimize weighting for mass spectral variables.
This facilitates algorithm-based evaluation of aerosol spectra, which may prove invaluable for future development of automatic methods for spectra identification and classification. Robust, statistics-based results and data visualizations also provide important clues to a human analyst on the existence and chemical interpretation of data structures. Applying these methods to a test data set, aerosol mass spectrometric measurements of organic aerosol at a boreal forest site, yielded five to seven different recurring pollution types from various sources, including traffic, cooking, biomass burning and nearby sawmills. Additionally, three distinct, minor pollution types were discovered and identified as amine-dominated aerosols.
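
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the positive matrix factorization step is omitted, the six four-channel "spectra" are invented for illustration, and the function names (normalize, kmeanspp_seeds, kmeans) are hypothetical. Each spectrum is scaled to unit length, so the squared Euclidean distance between two spectra equals 2(1 - cosine similarity). This is one possible choice of spectral dissimilarity and is not necessarily the metric the paper settles on.

```python
import math
import random

def normalize(spectrum):
    """Scale a spectrum to unit Euclidean length so only its shape matters."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / norm for x in spectrum]

def sq_dist(a, b):
    """Squared Euclidean distance; equals 2 * (1 - cosine) for unit vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: draw each new seed with probability proportional
    to its squared distance from the nearest seed chosen so far.
    Assumes the data contain at least k distinct points."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        d2 = [min(sq_dist(p, s) for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=d2, k=1)[0])
    return seeds

def kmeans(points, k, rng, n_iter=50):
    """Lloyd iterations started from k-means++ seeds."""
    centroids = kmeanspp_seeds(points, k, rng)
    labels = []
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy spectra: rows are plumes, columns are m/z channels.
spectra = [
    [9.0, 1.0, 0.5, 0.2],
    [8.5, 1.2, 0.4, 0.3],
    [9.2, 0.8, 0.6, 0.1],
    [0.3, 0.5, 8.8, 1.0],
    [0.2, 0.7, 9.1, 1.2],
    [0.4, 0.4, 8.6, 0.9],
]
points = [normalize(s) for s in spectra]
labels, centroids = kmeans(points, k=2, rng=random.Random(0))
print(labels)
```

In a real analysis the rows would be factor spectra extracted from pollution plumes, the number of clusters would be chosen by evaluating solution quality, and the variables could be weighted before normalization, as the abstract indicates.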

Citation: Äijälä, M., Heikkinen, L., Fröhlich, R., Canonaco, F., Prévôt, A. S. H., Junninen, H., Petäjä, T., Kulmala, M., Worsnop, D., and Ehn, M.: Resolving anthropogenic aerosol pollution types – deconvolution and exploratory classification of pollution events, Atmos. Chem. Phys., 17, 3165-3197, doi:10.5194/acp-17-3165-2017, 2017.
Short summary
Mass spectrometric measurements commonly yield data on hundreds of variables over thousands of points in time. Refining and synthesising this “raw” data into chemical information necessitates the use of advanced, statistics-based data analysis techniques. Here we present an example of combining data dimensionality reduction (factorisation) with exploratory classification (clustering) and show that the results complement and broaden our current perspectives on aerosol chemical classification.