Atmospheric Chemistry and Physics An interactive open-access journal of the European Geosciences Union
Volume 18, issue 13
Atmos. Chem. Phys., 18, 9597-9615, 2018
https://doi.org/10.5194/acp-18-9597-2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article 09 Jul 2018

Identification of new particle formation events with deep learning

Jorma Joutsensaari [1], Matthew Ozon [1], Tuomo Nieminen [1], Santtu Mikkonen [1], Timo Lähivaara [1], Stefano Decesari [2], M. Cristina Facchini [2], Ari Laaksonen [1,3], and Kari E. J. Lehtinen [1,4]
  • [1] Department of Applied Physics, University of Eastern Finland, P.O. Box 1627, 70211 Kuopio, Finland
  • [2] Institute of Atmospheric Sciences and Climate of the Italian National Research Council, Bologna, Italy
  • [3] Climate Research Unit, Finnish Meteorological Institute, Helsinki, Finland
  • [4] Atmospheric Research Centre of Eastern Finland, Finnish Meteorological Institute, Kuopio, Finland

Abstract. New particle formation (NPF) in the atmosphere is globally an important source of climate-relevant aerosol particles. Researchers typically identify NPF events manually, day by day, from particle size distribution data, which is time-consuming and can lead to inconsistent classification of event types. To obtain more reliable and consistent results, NPF event analysis should be automated. We have developed an automatic analysis method based on deep learning, a subarea of machine learning, for NPF event identification. To our knowledge, this is the first time that a deep learning method, i.e., transfer learning of a convolutional neural network (CNN), has successfully been used to automatically classify NPF events into different classes directly from particle size distribution images, similarly to how researchers carry out the manual classification. The developed method is based on image analysis of particle size distributions using a pretrained deep CNN, named AlexNet, which was retrained by transfer learning to recognize NPF event classes (six different types). In transfer learning, a subset of the particle size distribution images was used in the training stage of the CNN and the rest of the images were used to test the success of the training. The method was applied to a 15-year-long dataset measured at San Pietro Capofiume (SPC) in Italy. We studied the performance of the training with different ratios of training to testing images as well as with different regions of interest in the images. The results show that clear event days (i.e., classes 1 and 2) and nonevent days can be identified with an accuracy of ca. 80% when the CNN classification is compared with that of an expert, which is a good first result for automatic NPF event analysis. In event classification, the choice between different event classes is not an easy task even for trained researchers, and thus overlap or confusion between different classes occurs.
Hence, we cross-validated the learning results of the CNN with the expert-made classification. The results show that the overlap typically occurs between adjacent or similar classes, e.g., a manually classified Class 1 is categorized mainly into classes 1 and 2 by the CNN, indicating that the manual and CNN classifications are very consistent for most of the days. The classification would be more consistent, by both human and CNN, if only two classes were used for event days instead of three. Thus, we recommend that in future analyses event days be categorized into quantifiable (i.e., clear events, classes 1 and 2) and nonquantifiable (i.e., weak events, Class 3) classes. This would better describe the difference between those classes: both formation and growth rates can be determined for quantifiable days but not both for nonquantifiable days. Furthermore, we investigated in more detail the days that are classified as clear events by experts and recognized as nonevents by the CNN and vice versa. Clear misclassifications seem to occur more commonly in the manual analysis than in the CNN categorization, which is mostly due to inconsistency in the human-made classification or errors in the recording of the event class. In general, the automatic CNN classifier has better reliability and repeatability in NPF event classification than human-made classification and, thus, transfer-learned pretrained CNNs are powerful tools for analyzing long-term datasets. The developed NPF event classifier can be easily utilized to analyze any long-term dataset more accurately and consistently, which helps us to understand in detail aerosol–climate interactions and the long-term effects of climate change on NPF in the atmosphere. We encourage researchers to use the model at other sites. However, we suggest that the CNN be transfer learned again for new site data with a minimum of ca. 150 figures per class to obtain good enough classification results, especially if the size distribution evolution differs from the training data. In the future, we will apply the method to data from other sites, develop it to analyze more parameters and evaluate how successfully a CNN could be trained with synthetic NPF event data.
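
The comparison between expert and CNN classifications described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label strings, the function names and the six example days are all hypothetical placeholders, and only four of the six classes are shown. The sketch counts how often the two classifications agree, first with separate event classes and then with classes 1 and 2 merged into a single quantifiable group, as the abstract recommends.

```python
from collections import Counter

# Hypothetical mapping from detailed classes to the coarser grouping
# recommended in the abstract; the label strings are placeholders.
MERGE = {
    "class1": "quantifiable",
    "class2": "quantifiable",
    "class3": "nonquantifiable",
    "nonevent": "nonevent",
}


def confusion_counts(expert, cnn):
    """Count (expert_label, cnn_label) pairs, one pair per day."""
    if len(expert) != len(cnn):
        raise ValueError("label lists must have the same length")
    return Counter(zip(expert, cnn))


def accuracy(expert, cnn):
    """Fraction of days on which the two classifications agree."""
    counts = confusion_counts(expert, cnn)
    agree = sum(n for (e, c), n in counts.items() if e == c)
    total = sum(counts.values())
    return agree / total if total else 0.0


def merge_labels(labels, mapping=MERGE):
    """Map detailed labels onto coarser groups; unknown labels pass through."""
    return [mapping.get(label, label) for label in labels]


# Made-up labels for six days (illustration only, not data from the paper).
expert_labels = ["class1", "class1", "class2", "class3", "nonevent", "nonevent"]
cnn_labels = ["class1", "class2", "class2", "class3", "nonevent", "class3"]

fine_accuracy = accuracy(expert_labels, cnn_labels)
coarse_accuracy = accuracy(merge_labels(expert_labels), merge_labels(cnn_labels))
```

In this toy example the day labeled Class 1 by the expert and Class 2 by the CNN counts as a disagreement under the detailed scheme but as an agreement once both fall into the quantifiable group, which is the kind of adjacent-class overlap the abstract describes.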

Short summary
New particle formation (NPF) in the atmosphere is globally an important source of aerosol particles. NPF events are typically identified and analyzed manually by researchers from particle size distribution data day by day, which is time-consuming and might be inconsistent. We have developed an automatic analysis method based on deep learning for NPF event identification. The developed method can be easily utilized to analyze any long-term dataset more accurately and consistently.