Atmospheric Chemistry and Physics An interactive open-access journal of the European Geosciences Union
Atmos. Chem. Phys., 18, 14511–14537, 2018
https://doi.org/10.5194/acp-18-14511-2018

Research article 10 Oct 2018

# A neural network aerosol-typing algorithm based on lidar data

Doina Nicolae1, Jeni Vasilescu1, Camelia Talianu1,2, Ioannis Binietoglou1, Victor Nicolae1,3, Simona Andrei1, and Bogdan Antonescu1
• 1National Institute of R&D for Optoelectronics, 409 Atomiştilor Str., Măgurele, Ilfov, Romania
• 2Institute of Meteorology, University of Natural Resources and Life Sciences, 33 Gregor-Mendel Str., 1180, Vienna, Austria
• 3Faculty of Physics, University of Bucharest, Atomiştilor 405, Măgurele, Ilfov, Romania

Correspondence: Doina Nicolae (nnicol@inoe.ro)

Abstract

Atmospheric aerosols play a crucial role in the Earth's system, but their role is not completely understood, partly because of the large variability in their properties resulting from a large number of possible aerosol sources. Recently developed lidar-based techniques were able to retrieve the height distributions of optical and microphysical properties of fine-mode and coarse-mode particles, providing the types of the aerosols. One such technique is based on artificial neural networks (ANNs). In this article, a Neural Network Aerosol Typing Algorithm Based on Lidar Data (NATALI) was developed to estimate the most probable aerosol type from a set of multispectral lidar data. The algorithm was adjusted to run on the EARLINET $3\beta + 2\alpha\,(+1\delta)$ profiles. The NATALI algorithm is based on the ability of specialized ANNs to resolve the overlapping values of the intensive optical parameters, calculated for each identified layer in the multiwavelength Raman lidar profiles. The ANNs were trained using synthetic data, for which a new aerosol model was developed. Two parallel typing schemes were implemented in order to accommodate data sets containing (or not) the measured linear particle depolarization ratios (LPDRs): (a) identification of 14 aerosol mixtures (high-resolution typing) if the LPDR is available in the input data files, and (b) identification of five predominant aerosol types (low-resolution typing) if the LPDR is not provided. For each scheme, three ANNs were run simultaneously, and a voting procedure selects the most probable aerosol type. The whole algorithm has been integrated into a Python application. The limitation of NATALI is that the results are strongly dependent on the input data, and thus the outputs should be understood accordingly. Additional applications of NATALI are feasible, e.g. testing the quality of the optical data and identifying incorrect calibration or insufficient cloud screening. Blind tests on EARLINET data samples showed the capability of NATALI to retrieve the aerosol type from a large variety of data, with different levels of quality and physical content.

1 Introduction

Aerosols represent an important component of the Earth's system with a significant impact on climate (Seinfeld et al. 2016), weather (Fan et al. 2016; Gayatri et al. 2017; Marinescu et al. 2017), air quality (Fuzzi et al. 2015), biogeochemical cycles (Mahowald 2011; Mahowald et al. 2017), and health (Trippetta et al. 2016). A wide variety of aerosols are present in the atmosphere at any time, originating from multiple natural (e.g. mineral dust, sea spray, biogenic emissions, volcanic eruptions) and anthropogenic sources (e.g. traffic, industrial activities, biomass burning) and having a large variability in space and time (Calvo et al. 2013). This large variety and variability of the aerosols result in uncertainties in their impact. For example, aerosols can influence the microphysical properties of clouds and hence can have an impact on the energy balance, precipitation, and the hydrological cycle.

Aerosols have different scattering and absorption properties depending on their origin, with the largest radiative contribution coming from aerosols with radii between 0.1 and 1 µm. The uncertainties of the radiative forcing associated with the aerosol–cloud interactions have been reported to remain unchanged over the last four IPCC reports. Understanding the aerosol sources should reduce the uncertainties of their impact. Detailed knowledge of the aerosol sources can also be used to attribute their role to specific processes, evaluate aerosol models, and design better evidence-based air-quality regulations.

Global and local properties of atmospheric aerosols have been extensively observed and measured using both space-borne and ground-based instruments, especially during the last decade. Satellite remote-sensing observations have been exploited to characterize aerosol layers and to assess parameterizations for regional and global models (Amiridis et al. 2010). Global networks of sun/sky radiometers, such as the AErosol RObotic NEtwork (AERONET; Holben et al. 1998), measure the spectral aerosol optical depth (AOD) (Cattrall et al. 2005; Dubovik et al. 2002; Hamill et al. 2016). The magnitude of the AOD together with the Ångström exponent (i.e. the AOD dependence on the wavelength) can be used to infer the aerosol type, although information about the source is required (Boselli et al. 2012; Giles et al. 2012). However, measurements averaged over the entire atmospheric column cannot provide information regarding the vertical distribution of particles.

Active remote-sensing instruments, such as lidars, have been used to distinguish between different aerosol types by providing vertical profiles of aerosol optical properties, as well as to understand the three-dimensional structure and temporal variability of the aerosol field (Ansmann et al. 2010; Freudenthaler et al. 2009; Gasteiger et al. 2011a, b; Mattis et al. 2010). Even though detailed studies of aerosol optical properties have been conducted (Brock et al. 2016a, b; Palacios-Peña et al. 2018), there are no straightforward links between the optical properties and the aerosol sources, given that atmospheric aerosols occur as mixtures of types (David et al. 2013) and are therefore difficult to characterize.

Recent advances in atmospheric aerosol measurements have helped to address some of these issues, in particular, to separate different types of aerosols and their mixtures. For example, lidar measurements of aerosol parameters (i.e. lidar ratio, depolarization, backscatter colour ratio, spectral depolarization ratio) collected by the NASA Langley Research Center airborne High Spectral Resolution Lidar (HSRL; Hair et al. 2008) during measurement campaigns over North America have been analysed. The analysis showed that these parameters vary with location and with the aerosol type and thus can help to distinguish between different types of aerosols (e.g. HSRL measurements indicated that the lidar ratio can be used to discriminate between ice and dust and the spectral particle depolarization to discriminate between urban and biomass-burning aerosols). Another important advancement in the remote sensing of aerosols was the development of ground-based lidar networks, which provide quality-assured optical profiles on a large temporal and spatial scale. One such network is the European Aerosol Research Lidar Network (EARLINET), established in 2000 with the goal of developing a continental database of the temporal and spatial distribution of aerosols. The EARLINET data are relevant not only for climatological studies but also for special events with strong aerosol influence, such as Saharan dust outbreaks, forest-fire smoke plumes transported over large areas, photochemical smog, and volcanic eruptions. Recent efforts have focused on making complementary use of different instruments such as lidar and sun or sky photometry at combined EARLINET and AERONET stations (Alados-Arboledas et al. 2011; Ansmann et al. 2002, 2010; Granados-Muñoz et al. 2016; Mamouri et al. 2012; Müller et al. 2010; Perone and Bulizzi 2016). Several other approaches have been developed by combining ground-based measurements with airborne HSRL and satellite data (Burton et al. 2012, 2013, 2014, 2015; Groß et al. 2013; Kahn and Gaitley 2015; Kahn et al. 2010; Liu et al. 2002; Omar et al. 2009; Papagiannopoulos et al. 2016; Tesche et al. 2009b).

All these studies have revealed the existence of a wide variety of aerosols that are difficult to classify for several reasons (e.g. many aerosol types have similar optical properties). Another issue in aerosol classification is the difficulty in correlating their optical properties with their sources. In reality, atmospheric aerosols are mixtures from many sources, and data on pure aerosol types are sparse. To address these issues, systematic measurements and intensive measurement campaigns have been performed using different methods for aerosol typing (Burton et al. 2014; Tesche et al. 2009a) and complementary information such as trajectory and dispersion model analysis to estimate the origin of aerosols (Stein et al. 2016; Stohl et al. 2003). Since 2000, the EARLINET network has systematically measured the properties of aerosols from different sources over Europe. Intensive campaigns, like ACE-Asia (Murayama et al. 2003), SAMUM-1 (Tesche et al. 2009b), SAMUM-2 (Groß et al. 2011), SALTRACE (Groß et al. 2015), and ChArMEx/EMEP (Granados-Muñoz et al. 2016), have helped to understand the optical properties of aerosols (pure dust and mixtures) or anthropogenic aerosols from industrial areas. Furthermore, recent events, like the eruptions of Eyjafjallajökull in 2010 and Grimsvötn in 2011, offered a rare opportunity to perform studies on the optical properties of volcanic aerosols (Mona et al. 2012; Sicard et al. 2012; Tesche et al. 2012).

The multitude of instruments and retrievals resulted in an increasing amount of data on aerosol properties that had to be processed and classified. One possible way of processing large amounts of data, with the aim of distinguishing between different aerosol types, is to exploit artificial neural networks (ANNs). Starting from the premise that the best way to distinguish between certain data (e.g. image recognition, speech recognition, medical diagnosis) is human experience based on learning and education, ANNs were developed to solve problems in the same way that a human brain might. An ANN represents a mathematical projection of the brain in which the information propagates as a neural influx and is analysed. An ANN contains tens to hundreds of neurons divided into multiple layers depending on the data to be classified. The output of one layer of neurons represents the input to the next layer. The data for analysis must be constrained to a pattern, and the ANN needs to learn to identify this pattern. During the learning process, the weights of the connections between neurons are established. Learning in the case of ANNs means changing these weights each time training data are presented to the network. The change is based on the amount of error in the output compared to the expected result. Comprehensive descriptions of ANN theory can be found in the literature.

The capability of ANNs in classifying data has been widely proven in many areas of research (Jain et al. 2000). Over the last decades, ANNs were used for remote-sensing applications such as radars (Orlandini and Morlini 2000), microwave radiometers (Roberts et al. 2010), satellite retrievals (Ali et al. 2012), multi-angle spectropolarimeters (Di Noia et al. 2015), nephelometers (Berdnik and Loikov 2016), or multiple-source data sets. In this article, an in-house-developed ANN algorithm for aerosol typing is introduced. The algorithm relies on a set of ANNs which are trained to recognize the aerosol type based on typical lidar data products from EARLINET, i.e. three backscatter coefficients (β) at 1064, 532, and 355 nm, two extinction coefficients (α) at 532 and 355 nm, and, optionally, one linear particle depolarization (δ) at 532 nm. To distinguish between different aerosol types and their mixtures, the optical data presented to the ANNs have to be characteristic (i.e. independent of the density of the particles). Therefore the $3\beta + 2\alpha\,(+1\delta)$ lidar data are first used to compute intensive properties such as the Ångström exponent (AE), colour ratios (CR), colour indexes (CI), and lidar ratios (LR).

The ability of the ANNs to retrieve the aerosol type depends strongly on the physical content and the uncertainty of the optical inputs as well as on the structure of the ANN and the training process, including the extent of the data set used for this purpose. To create a consistent picture of the aerosol types, an aerosol model representing the optical properties of different aerosols was developed. This model is capable of reproducing the observed aerosol properties and thus can be used to construct a representative and statistically relevant synthetic database. This synthetic data set is needed because observational data sets that are statistically relevant, well characterized, and representative of the whole spectrum of aerosol types are sparse. The aerosol model was constructed to simulate a large number of lidar measurements (i.e. the synthetic data set), which were then used as input data to train the ANNs. The output data from the ANNs consist of the most probable aerosol type within the identified layers.

This article is organized as follows. The aerosol model that was used to generate the synthetic data set of lidar measurements is described in Sect. 2.1. The synthetic data set is then used as input for the ANNs, the core of the aerosol-typing algorithm, presented in Sect. 2.2. Sections 2.3 and 2.4 describe the Neural Network Aerosol Typing Algorithm Based on Lidar Data (NATALI). The comparison between the aerosol model output and the lidar measurements from previous studies is discussed in Sect. 3.1. Section 3.2 describes the performance of the ANNs. The comparison between the EARLINET-CALIPSO classification and NATALI is presented in Sect. 3.3. Finally, Sect. 4 summarizes this article.

Table 1. Conventional names of the aerosol types.

Table 2. Pure aerosol types and components.

2 Methodology

## 2.1 The aerosol model

An aerosol model was developed to calculate the optical properties of pure aerosols, i.e. those generated by a single source (e.g. dust produced by the deserts, marine particles produced by the oceans). In this article, six classes of pure aerosol are considered: continental, continental polluted, dust, marine, smoke, and volcanic (Table 1). The aerosol model combines the Global Aerosol Data Set (GADS; Koepke et al. 1997) with the T-matrix numerical method to iteratively compute the intensive optical properties of each aerosol type. The chemical composition of each pure aerosol type was taken from the OPAC (Optical Properties of Aerosols and Clouds) software package. The chemical composition of each aerosol type was varied within certain limits (the limits are detailed in Table 2 and refer to particle number density mixing ratios) in order to reproduce the large variety of particles present in the atmosphere. The synthetic database developed using the aerosol model is built for 350, 550, and 1000 nm sounding wavelengths. These wavelengths were selected from the 61 wavelengths (0.25–40 µm) of OPAC for which the microphysical characteristics of the aerosols are available from GADS. The values at the selected wavelengths are then rescaled to the usual lidar wavelengths (i.e. 355, 532, and 1064 nm) using an Ångström exponent equal to 1. This was considered a valid assumption for all aerosol types, taking into account the small difference between the lidar and the model wavelengths. If required, the aerosol model can be extended to other wavelengths.
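The rescaling from model to lidar wavelengths can be illustrated with a short sketch. It assumes the usual power-law dependence of a coefficient on wavelength, with the Ångström exponent fixed at 1 as stated above; the function name, the dictionary, and the numerical values are hypothetical and are not taken from the NATALI code.

```python
def rescale_to_lidar_wavelength(value, model_wavelength_nm, lidar_wavelength_nm,
                                angstrom_exponent=1.0):
    """Rescale a coefficient assuming value ~ wavelength**(-angstrom_exponent)."""
    ratio = lidar_wavelength_nm / model_wavelength_nm
    return value * ratio ** (-angstrom_exponent)


# Model wavelength -> nearest lidar wavelength (nm), as listed in the text.
MODEL_TO_LIDAR_NM = {350: 355, 550: 532, 1000: 1064}

# Illustrative extinction coefficients at the model wavelengths (arbitrary units).
alpha_model = {350: 2.0e-4, 550: 1.3e-4, 1000: 0.7e-4}

alpha_lidar = {
    MODEL_TO_LIDAR_NM[w]: rescale_to_lidar_wavelength(a, w, MODEL_TO_LIDAR_NM[w])
    for w, a in alpha_model.items()
}
```

Because the wavelength shifts are only a few percent, the correction is small, which is the stated reason for accepting a single exponent for all aerosol types.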

Each pure aerosol type is built as an internal mixture of basic components which do not interact physically or chemically, having different mixing ratios. The basic components are taken from OPAC: water soluble, insoluble, soot, mineral (nucleation, accumulation, coarse), sulfates, and sea salt (accumulation, coarse). The GADS database is used for the microphysical properties of each component. However, with the current values of the complex refractive index of soot in GADS, values greater than 1.2 for the Ångström exponent (550/350 nm) cannot be achieved for smoke and continental-polluted types. Based on previously published findings, a typical value of 1.41 was considered for the real part of the refractive index, instead of 1.75 as it is currently in GADS.

In the aerosol model, particles were considered to be spheroids with different aspect ratios (i.e. the ratio of the polar to equatorial lengths) to simulate the aerosol anisotropy (Table 2). Dust and volcanic aerosols were considered oblate (i.e. aspect ratio <1). Also, the proportion of soot was increased to compensate for the low hematite (iron oxide) content, consistent with earlier studies.

Starting from the microphysical properties (i.e. mode radius, width of the log-normal distribution, number density, density, and mass concentration) of each component, the microphysical properties of the pure aerosol were calculated by varying the critical component in certain limits (i.e. its number density mixing ratio), while the total mixture is normalized to 1 (Table 2). The mixing ratio of the aerosol components is given by

$\text{(1)}\quad \mu_j = \frac{N_j}{N_\text{t}}; \quad j = \overline{1,\text{NC}},$

where NC represents the number of components, $N_\text{t}$ is the total number of particles, $N_j$ is the number of particles for component $j$, and the boundary condition is given by

$\text{(2)}\quad \sum_j \mu_j = 1; \quad j = \overline{1,\text{NC}}.$

For each wavelength selected in the aerosol model, the real and the imaginary parts of the complex refractive index were determined with the Lorentz–Lorenz model:

$\text{(3)}\quad \frac{m_\text{p}^2 - 1}{m_\text{p}^2 + 2} = \sum_j \mu_j \cdot \frac{m_j^2 - 1}{m_j^2 + 2}; \quad j = \overline{1,\text{NC}},$

where $m_\text{p}$ and $m_j$ represent the complex refractive index of the particle and of component $j$ of the aerosol mixture. The aerosol radius ($r_\text{p}$) is calculated with the following equation:

$\text{(4)}\quad r_\text{p} = \sqrt[3]{\sum_j \mu_j \cdot \left(r_j^\text{mod}\right)^3}; \quad j = \overline{1,\text{NC}},$

where $r_j^\text{mod}$ is the mode radius of component $j$ at the given relative humidity (RH). The aerosol size distribution ($n(r)$) as a function of aerosol radius ($r$), assuming a mono-modal log-normal distribution, is given by

$\text{(5)}\quad n(r) = \frac{1}{\sqrt{2\pi} \cdot \ln(\sigma_\text{p}) \cdot r} \cdot \exp\left[-\left(\frac{\ln(r) - \ln(r_\text{p})}{\sqrt{2} \cdot \ln(\sigma_\text{p})}\right)^2\right],$

where $\sigma_\text{p} = \sum_j \mu_j \cdot \sigma_j$ ($j = \overline{1,\text{NC}}$) represents the width of the distribution for the aerosol, and $\sigma_j$ is the width of the distribution for component $j$ (computed as the standard deviation of the logarithm of the distribution with mode radius $r_j^\text{mod}$).
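Equations (1)–(5) can be followed step by step in a minimal NumPy sketch. The function names and the three-component example values are illustrative assumptions, not values from Table 2 or GADS, and Eq. (3) is solved analytically for the mixture index, which the text does not describe explicitly.

```python
import numpy as np


def mixing_ratios(number_densities):
    """Eq. (1): number-density mixing ratios; they sum to 1 as required by Eq. (2)."""
    n = np.asarray(number_densities, dtype=float)
    return n / n.sum()


def mixture_refractive_index(mu, m_components):
    """Eq. (3): Lorentz-Lorenz rule, solved for the complex index of the mixture."""
    m = np.asarray(m_components, dtype=complex)
    s = np.sum(mu * (m**2 - 1.0) / (m**2 + 2.0))
    # (m_p^2 - 1) / (m_p^2 + 2) = s  =>  m_p^2 = (1 + 2 s) / (1 - s)
    return np.sqrt((1.0 + 2.0 * s) / (1.0 - s))


def mixture_radius(mu, r_mod):
    """Eq. (4): cube root of the weighted sum of the cubed mode radii."""
    return float(np.cbrt(np.sum(mu * np.asarray(r_mod, dtype=float) ** 3)))


def mixture_width(mu, sigma):
    """Width of the mixture distribution: weighted sum of the component widths."""
    return float(np.sum(mu * np.asarray(sigma, dtype=float)))


def size_distribution(r, r_p, sigma_p):
    """Eq. (5): mono-modal log-normal distribution, normalized to one particle."""
    r = np.asarray(r, dtype=float)
    z = (np.log(r) - np.log(r_p)) / (np.sqrt(2.0) * np.log(sigma_p))
    return np.exp(-z**2) / (np.sqrt(2.0 * np.pi) * np.log(sigma_p) * r)


# Hypothetical three-component mixture (number densities, indices, radii in um, widths).
mu = mixing_ratios([7000.0, 0.4, 8300.0])
m_p = mixture_refractive_index(mu, [1.53 + 0.006j, 1.53 + 0.008j, 1.75 + 0.44j])
r_p = mixture_radius(mu, [0.0212, 0.471, 0.0118])
sigma_p = mixture_width(mu, [2.24, 2.51, 2.00])
```

Taking the principal square root in `mixture_refractive_index` keeps the real part of the index positive, which is the physically meaningful branch.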

Using the calculated microphysical properties as input to the T-matrix code, the effective cross sections for particle scattering ($C_\text{sca}$) and extinction ($C_\text{ext}$) as well as the scattering matrix elements (phase functions) were obtained. These parameters are further used to determine (for a single particle) the aerosol optical parameters. The extinction coefficient (α) is determined from Eq. (6) and the backscatter coefficient (β) from Eq. (7), where $F_{11}$ is the first element of the scattering matrix (phase function).

$\text{(6)}\quad \alpha = \int_{R_\text{min}}^{R_\text{max}} C_\text{ext} \cdot n(r)\,\text{d}r$

$\text{(7)}\quad \beta = \int_{R_\text{min}}^{R_\text{max}} C_\text{sca} \cdot \frac{F_{11}(180^\circ)}{4\pi} \cdot n(r)\,\text{d}r$

The integration domain ($R_\text{min}$:$R_\text{max}$), for which the effective radius, the extinction coefficient, and scattering coefficient are calculated, covers medium-size particles with a radius between 0.1 and 5.0 µm that contribute to the scattering and extinction of light. The radius was not increased further due to computing time limitations and model design limitations (i.e. the code used for the calculation of the optical parameters for spheroids does not achieve convergence for non-spherical particles). However, the latter limitation is not considered critical for the range of lidar wavelengths.
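The integrals of Eqs. (6) and (7) over the 0.1–5.0 µm domain can be sketched numerically as follows. The cross sections and the backscatter phase-function element normally come from the T-matrix code; here they are replaced by simple hypothetical stand-in functions so that the example runs on its own, and the grid size and trapezoidal rule are likewise assumptions.

```python
import numpy as np


def lognormal(r, r_p, sigma_p):
    """Mono-modal log-normal size distribution of Eq. (5)."""
    z = (np.log(r) - np.log(r_p)) / (np.sqrt(2.0) * np.log(sigma_p))
    return np.exp(-z**2) / (np.sqrt(2.0 * np.pi) * np.log(sigma_p) * r)


def integrate(y, x):
    """Trapezoidal rule on a non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def extinction_and_backscatter(c_ext, c_sca, f11_180, r_p, sigma_p,
                               r_min=0.1, r_max=5.0, n_points=2000):
    """Eqs. (6) and (7) evaluated over radii r_min..r_max (micrometres)."""
    r = np.geomspace(r_min, r_max, n_points)
    n_r = lognormal(r, r_p, sigma_p)
    alpha = integrate(c_ext(r) * n_r, r)
    beta = integrate(c_sca(r) * f11_180(r) / (4.0 * np.pi) * n_r, r)
    return alpha, beta


# Hypothetical stand-ins for the T-matrix output (NOT physical results):
c_ext = lambda r: 2.0 * np.pi * r**2          # large-particle limit, efficiency of 2
c_sca = lambda r: 0.9 * c_ext(r)              # constant single-scattering albedo 0.9
f11_180 = lambda r: np.full_like(r, 0.3)      # constant backscatter phase function

alpha, beta = extinction_and_backscatter(c_ext, c_sca, f11_180, r_p=0.3, sigma_p=2.0)
```

With a size-independent albedo and phase function, the ratio `alpha / beta` reduces to $4\pi/(0.9 \cdot 0.3)$ regardless of the size distribution, which gives a convenient sanity check for the integration.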

The single-scattering albedo ($\overline{\omega}$) is obtained as the ratio of the scattering and extinction effective cross sections:

$\text{(8)}\quad \overline{\omega} = \frac{C_\text{sca}}{C_\text{ext}}.$

The lidar ratio (LR) is determined by the following relationship,

$\text{(9)}\quad \text{LR} = \frac{4\pi}{\overline{\omega} \cdot F_{11}(180^\circ)},$

while the particle linear depolarization (δ) is calculated based on the elements of the scattering matrix,

$\text{(10)}\quad \delta = \frac{F_{11} - F_{22}}{F_{11} + F_{22}}.$
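As a small worked example, Eqs. (8)–(10) translate directly into three functions. The sketch assumes that the scattering-matrix elements in Eq. (10) are evaluated in the backscatter direction (180°), which is the usual lidar convention but is not spelled out in the equation; all input numbers are illustrative.

```python
import math


def single_scattering_albedo(c_sca, c_ext):
    """Eq. (8): ratio of scattering to extinction cross sections."""
    return c_sca / c_ext


def lidar_ratio(omega, f11_180):
    """Eq. (9): lidar ratio from the albedo and the backscatter phase function."""
    return 4.0 * math.pi / (omega * f11_180)


def linear_depolarization(f11_180, f22_180):
    """Eq. (10), with both elements taken at 180 degrees (assumption)."""
    return (f11_180 - f22_180) / (f11_180 + f22_180)


# Illustrative single-particle values (not T-matrix results).
omega = single_scattering_albedo(c_sca=0.9, c_ext=1.0)
lr = lidar_ratio(omega, f11_180=0.25)
delta_sphere = linear_depolarization(0.25, 0.25)   # F22 = F11 for spheres
delta_spheroid = linear_depolarization(0.25, 0.15)
```

For a sphere $F_{22} = F_{11}$, so the depolarization vanishes; any non-zero value therefore signals non-spherical particles, which is why δ is so useful for separating dust and volcanic types.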

The algorithm is iterated for each composition, wavelength, and RH value until the entire selected domain is covered. The domain represents the range in which the parameters are varied (e.g. the domain for the wavelength is [350, 550, 1000 nm]; the domain for RH is [50 %, 70 %, 80 %, 90 %]; the domain for the number density mixing ratios for each component of each pure aerosol type is listed in Table 2). The algorithm generates the properties and mixing ratio of each component, and the optical and microphysical properties of the aerosol, for each wavelength, each RH value, and each composition.

Four classes of RH (i.e. 50 %, 70 %, 80 %, 90 %) are considered, out of the eight classes in OPAC. The high RH values (i.e. above 90 %) were excluded in order to avoid ambiguous results related to activation of the hygroscopic particles. Dry particles (0 % RH) were considered too rarely present in the ambient atmosphere. For a better representation of the particle growth, the OPAC RH classes were linearly interpolated with a 1 % step for pure types and 5 % for mixed types and linearly extrapolated down to 40 %. Thus, within a 40 %–90 % range the hygroscopic growth is considered linear for all pure aerosols included in the model.
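The interpolation over RH can be sketched as below. The text does not say which slope is used for the extrapolation down to 40 %; this sketch assumes the slope of the first tabulated segment (50 %–70 %). The function name and the example radii are hypothetical.

```python
import numpy as np

OPAC_RH = np.array([50.0, 70.0, 80.0, 90.0])  # RH classes retained from OPAC (%)


def interpolate_over_rh(values_at_opac_rh, rh_min=40.0, rh_max=90.0, step=1.0):
    """Piecewise-linear interpolation on a regular RH grid, extrapolated below 50 %."""
    values = np.asarray(values_at_opac_rh, dtype=float)
    rh_grid = np.arange(rh_min, rh_max + step / 2.0, step)
    result = np.interp(rh_grid, OPAC_RH, values)
    # np.interp clamps outside the tabulated range, so extend the first
    # segment linearly for RH < 50 % (assumed extrapolation rule).
    slope = (values[1] - values[0]) / (OPAC_RH[1] - OPAC_RH[0])
    below = rh_grid < OPAC_RH[0]
    result[below] = values[0] + slope * (rh_grid[below] - OPAC_RH[0])
    return rh_grid, result


# Illustrative mode radii (um) of a hygroscopic component at 50, 70, 80, 90 % RH.
rh_pure, radius_pure = interpolate_over_rh([0.10, 0.12, 0.14, 0.18], step=1.0)
rh_mixed, radius_mixed = interpolate_over_rh([0.10, 0.12, 0.14, 0.18], step=5.0)
```

The 1 % grid corresponds to the pure types and the 5 % grid to the mixed types mentioned in the text.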

Even when a certain variation of the aerosol composition and of RH is considered, the simulated optical parameters do not cover the whole range of measured values. This is partly due to the limitations of the model itself and partly due to the various uncertainties associated with the measurements, either due to the instrument (e.g. biases, calibration) or due to the data treatment (e.g. algorithms applied in preprocessing to correct or average raw signals, algorithms used to calculate data products). Optical parameters calculated from lidar measurements are reported in the EARLINET database as the mean value ($x_\text{med}$) and associated uncertainty (absolute error, $\Delta x$). Optical parameters calculated from synthetic data do not carry this uncertainty; therefore a fixed relative error was considered, which was multiplied by the value to obtain the absolute error (uncertainty). For the actual retrieval of the aerosol type, any value between ($x_\text{med}$ − uncertainty) and ($x_\text{med}$ + uncertainty) is possible; therefore the algorithm was applied to all these values with a certain step (i.e. the finesse). The output is a “bundle” of possible aerosol types, with a dimension equal to the finesse. A compromise should be made between the finesse and computing time.
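The construction of the value bundle can be sketched as follows, under the assumption that the samples are evenly spaced across the uncertainty interval and that the finesse is the number of samples; the text only states that a step is used. Names and numbers are illustrative.

```python
import numpy as np


def synthetic_uncertainty(value, relative_error):
    """Absolute error obtained from a fixed relative error (synthetic data)."""
    return abs(value) * relative_error


def value_bundle(x_med, uncertainty, finesse):
    """Evenly spaced values in [x_med - uncertainty, x_med + uncertainty]."""
    return np.linspace(x_med - uncertainty, x_med + uncertainty, finesse)


# Illustrative extinction coefficient with a 50 % relative error and a finesse of 5.
alpha_med = 1.0e-4
bundle = value_bundle(alpha_med, synthetic_uncertainty(alpha_med, 0.5), finesse=5)
```

Each value in `bundle` would be typed separately, so the cost of a retrieval grows linearly with the finesse, which is the compromise noted above.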

Based on the values reported in the literature (Ansmann et al. 2002; Freudenthaler et al. 2009), a large uncertainty is associated with the extinction coefficient derived with the Raman method, mainly due to noisy Raman lidar signals (i.e. the relative error reported in the lidar measurements is 30 %–150 % and the fixed relative error considered in the synthetic data is 50 %). Particle depolarization is very sensitive to the calibration, both for the raw signals of the two channels and for the backscattering. Thus, the values for particle depolarization also have a significant uncertainty (i.e. the relative error reported in the lidar measurements is 2 %–50 % and the fixed relative error considered in the synthetic data is 30 %). The backscatter coefficient calculated from the combination of Raman-elastic channels is less sensitive (i.e. the relative error reported in the lidar measurements is 10 %–50 % and the fixed relative error considered in the synthetic data is 20 %). Even in the case of HSRL, for which the extinction and the backscattering are independently calculated, the cross-talk between the Mie and the Rayleigh channels still introduces systematic errors, which are larger for the extinction than for the backscattering.

The relative errors considered here are 50 % for the extinction, 20 % for the backscattering, and 30 % for the depolarization. Note that these values were chosen to be inclusive, so as to mimic both high-precision and moderate-precision retrieved parameters. Although for the microphysical inversion the recommended maximum value for the uncertainty of the optical parameters is 20 %, this is not critical for aerosol classification, as long as a relevant number of parameters is provided (e.g. measured lidar ratios and Ångström exponent are required).

Figure 1. The generation chain of the synthetic data for the NATALI algorithm.

Table 4 shows the aerosol types considered in this study: six pure aerosols, seven mixtures of two pure aerosols, and two mixtures of three pure aerosols. The mixtures were obtained by linear combination of pure aerosol properties. Mixtures composed of only two pure types were considered insufficient. For example, transcontinental transport involves at least three types of pure aerosols (e.g. transport from Africa to Europe can result in a mixture of continental, dust, and marine aerosols). Adding marine aerosols drastically changes the optical properties of the mixtures of two pure aerosols. Thus, mixtures of three aerosol types were considered, especially those containing marine types. From the total number of possible mixtures of two and three aerosols (i.e. 35 mixtures), only those that are most frequently observed and can still be distinguished were selected (i.e. 9 mixtures; see Table 4). This selection of mixtures was also a compromise between the time performance of the algorithm and the minimum number of output aerosol types considered significant in the atmosphere.
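The count of 35 possible mixtures follows from choosing two or three of the six pure types, and the linear combination can be sketched in a few lines. The weights and coefficient values below are hypothetical; the text does not specify which fractions were used for each mixture.

```python
from itertools import combinations

PURE_TYPES = ("continental", "continental polluted", "dust",
              "marine", "smoke", "volcanic")

two_type_mixtures = list(combinations(PURE_TYPES, 2))    # C(6, 2) = 15
three_type_mixtures = list(combinations(PURE_TYPES, 3))  # C(6, 3) = 20
n_possible = len(two_type_mixtures) + len(three_type_mixtures)


def mix_linear(coefficients, fractions):
    """Linear combination of one optical coefficient over the mixed pure types."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(f * c for f, c in zip(fractions, coefficients))


# Illustrative backscatter coefficients of continental, dust, and marine aerosol
# combined with assumed fractions of 50 %, 30 %, and 20 %.
beta_mix = mix_linear([2.0e-6, 1.0e-6, 0.5e-6], [0.5, 0.3, 0.2])
```

Only 9 of these 35 combinations are retained in Table 4, as explained above.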

The generated optical properties of pure aerosols and mixtures serve as a basis for the determination of the extinction Ångström exponent ($\kappa_\text{ext}$) and the backscatter Ångström exponent ($\kappa_\text{bsca}$), also referred to as a colour index, for each wavelength combination (Fig. 1). Thus, the extinction Ångström exponent is given by the relationship

$\text{(11)}\quad \kappa_\text{ext} = -\frac{\ln\left(\alpha_{\lambda_1} / \alpha_{\lambda_2}\right)}{\ln\left(\lambda_1 / \lambda_2\right)}.$

Similarly, the backscatter Ångström exponent (colour index) can be determined using the equation

$\text{(12)}\quad \kappa_\text{bsca} = -\frac{\ln\left(\beta_{\lambda_1} / \beta_{\lambda_2}\right)}{\ln\left(\lambda_1 / \lambda_2\right)}.$
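Equations (11) and (12) share the same form, so a single function covers both. The sketch below uses illustrative coefficients that follow an exact power law, so the recovered exponent is known in advance.

```python
import math


def angstrom_exponent(x_1, x_2, wavelength_1, wavelength_2):
    """Eqs. (11)/(12): Angstrom exponent of a coefficient x between two wavelengths."""
    return -math.log(x_1 / x_2) / math.log(wavelength_1 / wavelength_2)


# Illustrative extinction coefficients following alpha ~ wavelength**(-1.5).
alpha_355 = 355.0 ** -1.5
alpha_532 = 532.0 ** -1.5
kappa_ext = angstrom_exponent(alpha_355, alpha_532, 355.0, 532.0)
```

Passing backscatter coefficients instead of extinction coefficients gives $\kappa_\text{bsca}$; only the ratios enter, so the result does not depend on the particle concentration.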

After the calculation of the spectral parameters for pure and mixed aerosols, the synthetic data are used as an input for the artificial neural networks.

## 2.2 The architecture of the artificial neural networks

The ANNs can be calibrated or “trained” for a specific purpose. Here, ANNs are trained to classify aerosols using solely the lidar intensive properties as input data, without any complementary information. The ANNs used here to classify aerosols were developed using NeuroSolutions, a neural network development environment. Several ANN architectures have been explored: Multilayer Perceptron (MLP), Jordan/Elman Network (JE), Generalized Feed Forward Network (GFF), Self-Organizing Feature Maps (SOFM), and Recurrent Neural Network (RNN). Each ANN architecture contains several hidden layers and different learning rules. Each layer is composed of a vector of processing elements of identical parameters (e.g. TanhAxon, SigmoidAxon, LinearTanhAxon) with an associated learning rule and learning parameters.

No significant improvement in the classification of the aerosols was achieved by using different types of processing elements in the ANN structure. Thus, TanhAxon elements were subsequently used. The TanhAxon applies a bias and a hyperbolic tangent function (i.e. tanh) to each neuron in the layer and replaces a part of the tanh by a line with a slope β. The values of each neuron are forced to lie in the interval between −1 and 1. For TanhAxon the activation function is defined as

$\text{(13)}\quad f(x_i, w_i) = \tanh\left[x_i^\text{lin}\right],$

where $x_i^\text{lin} = \beta x_i$ and $w_i$ is the bias vector.
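A minimal NumPy sketch of the activation in Eq. (13) is given below. It implements only what the equation states, i.e. a tanh applied to the slope-scaled input; how the bias $w_i$ enters inside NeuroSolutions is not specified in the text, so it is left out here, and the function name is hypothetical.

```python
import numpy as np


def tanh_axon(x, beta=1.0):
    """Eq. (13): tanh of the slope-scaled input; outputs lie between -1 and 1."""
    return np.tanh(beta * np.asarray(x, dtype=float))


inputs = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])
outputs = tanh_axon(inputs, beta=2.0)
```

The saturation at ±1 is what keeps the value of every neuron bounded, as stated above.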

Table 3. Selected types of artificial neural networks and their structures.

Table 4. Correspondence between the aerosol types defined in the algorithm, as they can be retrieved by NATALI in high resolution and low resolution.

Supervised training has been used to train the ANNs. Thus, sets of input and output parameters have been successively presented to the networks for around 1000 epochs (i.e. one forward pass and one backward pass of all the training examples) per training cycle. Backpropagation is the most common method for training ANNs with more than one hidden layer. In the case of backpropagation, the weights on input elements are changed based on their previous value and a correction term. This training approach was also used for the design of the NATALI ANNs: the input data are continuously presented to the ANN and the output is compared with the known aerosol type from the synthetic database in order to adjust the weights until the desired result is achieved. The optimal values of weights and the minimum errors were taken into account for the testing process. The minimum classification errors were 75 % for more than 80 % of the measured data and 75 % for more than 90 % of the synthetic data.

Several learning rules have been tested: momentum, conjugate gradient descent, step, and Levenberg–Marquardt. The momentum learning rule is a simple and efficient approach in comparison with standard gradient descent. It provides the gradient descent with some inertia, depending on the momentum parameter, which controls the smoothness of the gradient estimation. The momentum parameter is the same for all processing elements in a layer. The conjugate gradient rule has no parameters that need to be adjusted (e.g. learning rates, momentum parameter) and is faster and more accurate than standard backpropagation. The other two rules were found inadequate for aerosol-typing purposes: the step rule, a type of standard gradient descent algorithm that allows the user to set a default step size for all weights within the activation component, and the Levenberg–Marquardt rule, which gives a numerical solution to the problem of minimizing a non-linear function. The step rule recognized all aerosol types after the first training cycle, but its active performance was low. The Levenberg–Marquardt algorithm stalled after several epochs.
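The inertia provided by the momentum rule can be illustrated with the classical update, in which part of the previous weight change is carried into the next step. This is a generic textbook sketch, not the NeuroSolutions implementation; the learning rate, momentum value, and the quadratic test function are assumptions chosen for illustration.

```python
import numpy as np


def momentum_step(weights, gradient, velocity, learning_rate=0.1, momentum=0.7):
    """One update: the new change mixes the previous change with the gradient step."""
    velocity = momentum * velocity - learning_rate * gradient
    return weights + velocity, velocity


# Minimize f(w) = 0.5 * |w|^2, whose gradient is simply w.
weights = np.array([1.0, -2.0])
velocity = np.zeros_like(weights)
for _ in range(200):
    weights, velocity = momentum_step(weights, gradient=weights, velocity=velocity)
```

Setting `momentum=0.0` recovers plain gradient descent with a fixed step, which corresponds to the step rule described above.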

Figure 2. Artificial neural network logical scheme for the NATALI algorithm.

The cross-validation and test set methods have been used to stop the learning process and to assess the performances. Cross-validation monitors the error for a set of data and stops training when this error begins to increase. After a full process of training, in our case 5–10 training cycles, a testing set of data is presented to the ANN and the network output is compared with the known aerosol type from the synthetic database. In total 68 ANN structures have been explored, starting from the simplest (reduced number of hidden layers) to the most complex ones, in order to find a compromise between the minimum possible time of training and testing and the avoidance of saturation effects. Examples of six pure, seven double-component mixtures, and two triple-component mixtures obtained within the 68 explored ANNs are presented in Table 4. For the selection of the ANNs, the synthetic database has been split randomly into data used to train the ANN (70 % of all synthetic data sets), data used to test the ANN (20 % of all synthetic data sets), and data used for validation (10 % of all synthetic data sets). In the training process, data sets are presented to the ANN with the correct answer. The training is performed iteratively until the testing and validation classification errors are below 25 % (Fig. 2). A finer adjustment of the weighting coefficients is done during the testing process. The last 10 % of the data are presented to the ANN without the known result in order to validate the optimum training process and the capability of the network to classify new data inputs.
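The random 70 %/20 %/10 % split and the stopping rule described above can be sketched as follows. This is an illustrative example rather than the code used for NATALI; the function names, the fixed random seed, and the simple "stop at the first increase" rule are assumptions.

```python
import random


def split_dataset(samples, fractions=(0.7, 0.2, 0.1), seed=0):
    """Randomly split samples into training, testing and validation subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(round(fractions[0] * len(shuffled)))
    n_test = int(round(fractions[1] * len(shuffled)))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains (about 10 %) is kept aside for validation.
    validation = shuffled[n_train + n_test:]
    return train, test, validation


def early_stopping_epoch(monitored_errors):
    """Return the index of the last epoch before the monitored error increases."""
    for epoch in range(1, len(monitored_errors)):
        if monitored_errors[epoch] > monitored_errors[epoch - 1]:
            return epoch - 1
    return len(monitored_errors) - 1
```

In practice a stopping rule usually tolerates a few epochs of increasing error before halting; the single-step version here only illustrates the principle.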

Three basic ANNs (adjusted to accommodate all data) have been chosen as appropriate to classify the multiwavelength lidar data in parallel, for both high- and low-resolution classification: the Jordan–Elman with 6 or 8 hidden layers, and the generalized feedforward with 10 hidden layers (Table 3). The selected types of ANNs classify the aerosols based on the response with higher confidence (i.e. the probability of having one of the aerosol types). The ANNs have been trained using 3500 samples for each aerosol type and successive training sessions until the best weights are reached (i.e. the classification process is ended, and the classification errors are low).

Figure 3. Schematics of the NATALI algorithm for aerosol typing.

## 2.3 The typing algorithm

Following the methodology described in Sect. 2.1 and taking into account the uncertainty threshold of each optical parameter, a bundle of inputs for each measured or simulated aerosol layer was generated. Answers with low confidence are filtered out (e.g. by using a threshold of minimum 0.7 confidence). The correct answer is selected based on a statistical approach considering two criteria: (a) which answer has a higher confidence; (b) which answer is more stable over the uncertainty range.

The input parameters for NATALI are typical data products from the EARLINET database: backscatter coefficient (β) profiles at 1064, 532 and 355 nm, extinction coefficient (α) profiles at 532 and 355 nm, and, optionally, the linear particle depolarization (δ) profile at 532 nm. The identification of aerosol types is not always possible because it depends on the physical content (i.e. with or without δ) and the quality of the optical data (i.e. calibration, uncertainty). For these reasons three classification schemes are used with different aerosol type resolutions (Table 4). First, when particle depolarization is available and all optical parameters are provided with high quality (uncertainty of the aerosol extinction coefficient ≤50 %, uncertainty of the aerosol backscatter coefficient ≤20 %, uncertainty of the particle linear depolarization ratio ≤30 %), the typing is performed in high resolution (AH). This means that the mixtures can be resolved and the number of outputs is 14 (i.e. pure with minimum 90 %, mixtures of two, and mixtures of three pure aerosol types). Second, when particle depolarization is available and the optical parameters have a high uncertainty (uncertainty of the aerosol extinction coefficient >50 %, uncertainty of the aerosol backscatter coefficient >20 %, uncertainty of the particle linear depolarization ratio >30 %), the typing is performed in low resolution (AL). In this case, the number of outputs is six (i.e. pure with maximum 30 % traces of other types). Third, when the particle depolarization is not available, the typing is performed in low resolution, again meaning that the mixtures cannot be resolved. In this case, the predominant aerosol type is retrieved for four outputs (pure with maximum 30 % traces of other types): if only spectral parameters are provided, the volcanic type cannot be distinguished from dust or continental pollution and is therefore excluded as an output.
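The choice between the three schemes can be expressed as a small decision function. The sketch below uses the uncertainty thresholds quoted above. The function name is hypothetical, the label "BL" for the scheme without depolarization is shorthand introduced here, and the rule that any threshold violation leads to low-resolution typing is an assumption, since the text only describes the cases in which all thresholds are met or all are exceeded.

```python
def select_scheme(unc_alpha, unc_beta, unc_delta=None):
    """Pick a typing scheme from relative uncertainties given in percent.

    unc_delta is None when the particle depolarization is not available.
    """
    if unc_delta is None:
        # No depolarization: low-resolution typing of the predominant type only.
        return "BL"
    high_quality = unc_alpha <= 50 and unc_beta <= 20 and unc_delta <= 30
    # Assumption: any threshold violation falls back to low resolution.
    return "AH" if high_quality else "AL"
```

For example, `select_scheme(30, 15, 20)` selects high-resolution typing, while the same data without depolarization, `select_scheme(30, 15)`, can only be typed in low resolution.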

The three ANNs (Table 3) were developed for three classification schemes (Table 4) to increase the confidence of the aerosol typing. A voting procedure selects the most probable answer out of the three (possibly different) individual returns. The selection is made based on the confidence level of the ANN outputs and stability over the uncertainty range (i.e. the percentage of agreement for values between error limits).
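A possible form of the voting procedure is sketched below. Each ANN is assumed to return one (type, confidence) pair per perturbed input within the uncertainty range. Answers below the 0.7 confidence threshold are discarded, and the stability is taken as the fraction of inputs agreeing on the retained type. How confidence and stability are combined is not specified in the text; the product used here, like all function names, is an assumption made for illustration only.

```python
from collections import Counter


def summarise_ann(outputs, min_confidence=0.7):
    """Reduce one ANN's answers to (type, mean confidence, stability)."""
    kept = [(t, c) for t, c in outputs if c >= min_confidence]
    if not kept:
        return None
    counts = Counter(t for t, _ in kept)
    best_type, n_agree = counts.most_common(1)[0]
    # Stability: share of all perturbed inputs that agree on the retained type.
    stability = n_agree / len(outputs)
    confidence = sum(c for t, c in kept if t == best_type) / n_agree
    return best_type, confidence, stability


def vote(ann_outputs):
    """Select the most probable type among several ANNs.

    ann_outputs maps an ANN identifier to its list of (type, confidence) pairs.
    Returns (type, identifier of the ANN delivering the result).
    """
    best = None
    for ann_id, outputs in ann_outputs.items():
        summary = summarise_ann(outputs)
        if summary is None:
            continue
        aerosol_type, confidence, stability = summary
        score = confidence * stability  # assumed combination rule
        if best is None or score > best[0]:
            best = (score, aerosol_type, ann_id)
    if best is None:
        return "unknown", None
    return best[1], best[2]
```

With this rule, an ANN that returns the same type with high confidence for every perturbed input outweighs one whose answer changes within the error limits.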

## 2.4 The NATALI code

The Neural Network Aerosol Typing Algorithm based on LIdar data (NATALI), developed in the Python programming language, is built on three modules: (a) an input module to prepare the inputs in the specific format of the ANNs, (b) a typing module to run the ANNs and decide on the most probable aerosol type, and (c) an output module to save the results and logs. The input module reads the lidar files in EARLINET NetCDF format, checks for the availability of all required parameters (β1064, β532, β355, α532, α355, and optionally δ532), identifies the layer geometrical boundaries and calculates within each layer the mean intensive optical parameters (i.e. Ångström exponent, colour indexes, colour ratios, lidar ratios, particle linear depolarization ratio) and their associated uncertainties (Fig. 3).
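For orientation, some of the intensive optical parameters can be computed from layer-mean extinction and backscatter coefficients as in the sketch below. The formulas are the definitions commonly used in the lidar literature; the exact conventions in NATALI (e.g. which wavelength is the numerator of the colour ratio, and the definition of the colour index, which is not shown) are not stated here, so the function names and argument order are assumptions.

```python
import math


def angstrom_exponent(alpha_1, alpha_2, wavelength_1, wavelength_2):
    """Extinction-related Angstrom exponent between two wavelengths."""
    return -math.log(alpha_1 / alpha_2) / math.log(wavelength_1 / wavelength_2)


def lidar_ratio(alpha, beta):
    """Extinction-to-backscatter ratio (sr) at one wavelength."""
    return alpha / beta


def colour_ratio(beta_1, beta_2):
    """Ratio of backscatter coefficients at two wavelengths (assumed order)."""
    return beta_1 / beta_2
```

For instance, extinction coefficients of 1.5 × 10⁻⁴ m⁻¹ at 355 nm and 1.0 × 10⁻⁴ m⁻¹ at 532 nm give an Ångström exponent close to 1.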

The layer boundaries are calculated by applying the gradient method to the 1064 nm backscatter coefficient profile. The inflexion points of the second derivative of the profile, computed with the Savitzky–Golay filter, give the top and the bottom of the layers. The window size of the cubic Savitzky–Golay filter, which can be modified by the user, has a default value of 700 m. The filter is applied twice to obtain the second derivative. A signal-to-noise ratio (SNR) filter is applied at this point to ensure that the ratio is at least 5. The layer boundaries are moved towards the median height until the SNR criterion is met; if the criterion cannot be satisfied with a layer height greater than 300 m, the layer is discarded. A coarse or fine structure of the aerosol layers is revealed by a higher or lower value of the adjustable smoothing parameter (FINESSE). Only layers with thicknesses of more than 300 m are considered, and the intensive optical properties and their uncertainties are computed for the middle of each layer, over a range of at least 200 m thickness, to exclude the margins likely affected by the smoothing.
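The SNR-based adjustment of the layer boundaries can be illustrated with the following sketch. It assumes that the criterion is applied to the mean SNR over the layer and that both boundaries are moved inwards by one range bin at a time; neither detail is specified in the text, and the function name and interface are hypothetical.

```python
def trim_layer(heights, snr, bottom, top, min_snr=5.0, min_depth=300.0):
    """Shrink a layer towards its centre until the mean SNR is acceptable.

    heights: bin heights in metres (increasing); snr: per-bin signal-to-noise
    ratio; bottom, top: indices of the initial layer boundaries.
    Returns the adjusted (bottom, top) indices, or None if the layer is discarded.
    """
    while heights[top] - heights[bottom] > min_depth:
        layer = snr[bottom:top + 1]
        if sum(layer) / len(layer) >= min_snr:
            return bottom, top
        # Move both boundaries one bin towards the middle of the layer.
        bottom += 1
        top -= 1
    # The criterion could not be met with a layer deeper than min_depth.
    return None
```

A layer whose noisy edges dominate the mean is thus narrowed until only the well-measured core remains, and it is dropped once the remaining depth is no longer greater than 300 m.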

Several filters are applied to the data, and only layers which pass the following criteria are further considered for typing:

• availability of all necessary intensive optical parameters,

• values of the intensive optical parameters are between acceptable limits (Table 5),

• the relative error of each intensive optical parameter is lower than 50 %.
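The three criteria above can be checked per layer as in the following sketch. The acceptable limits are passed in as an argument because the values of Table 5 are not reproduced here; the parameter names and the data layout are assumptions.

```python
def layer_passes(params, limits, max_relative_error=0.5):
    """Check availability, acceptable limits and relative error for one layer.

    params: parameter name -> (value, uncertainty), or None if not available.
    limits: parameter name -> (lower limit, upper limit), e.g. from Table 5.
    """
    for name, (low, high) in limits.items():
        entry = params.get(name)
        if entry is None:
            return False  # a necessary parameter is missing
        value, uncertainty = entry
        if not (low <= value <= high):
            return False  # outside the acceptable limits
        if value == 0 or abs(uncertainty / value) >= max_relative_error:
            return False  # relative error not lower than 50 %
    return True
```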

Table 5. Acceptable limits for the layer average intensive optical parameters.

For each layer and for each intensive optical parameter, the input module generates an adjustable number N of values x with uncertainties (Δx) in the range between x−Δx and x+Δx. The data are then scrambled, considering that any combination has a similar probability of describing reality. The cluster of possible combinations of intensive optical parameters is then converted into the ANN input format.
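The generation and scrambling of the perturbed inputs can be sketched as follows. The example assumes that the N values are equally spaced between x−Δx and x+Δx and that combinations are drawn uniformly at random; the spacing, the number of combinations, and the function name are assumptions, as the text does not specify them.

```python
import random


def perturbed_inputs(layer_params, n_values=5, n_samples=20, seed=0):
    """Build random combinations of parameter values within their uncertainties.

    layer_params: parameter name -> (x, dx). Requires n_values >= 2.
    Returns a list of dicts, each one candidate input for the ANNs.
    """
    rng = random.Random(seed)
    grids = {}
    for name, (x, dx) in layer_params.items():
        step = 2.0 * dx / (n_values - 1)
        # N equally spaced values covering [x - dx, x + dx].
        grids[name] = [x - dx + i * step for i in range(n_values)]
    names = sorted(grids)
    # Scramble: every combination of values is treated as equally probable.
    return [{name: rng.choice(grids[name]) for name in names}
            for _ in range(n_samples)]
```

Each returned combination is then typed separately, so that the fraction of combinations leading to the same answer can serve as a measure of stability over the uncertainty range.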

The typing module runs the ANNs in parallel for each data set representing a layer, and applies the voting procedure to identify the most probable aerosol type. If the depolarization is available, the module runs six ANNs in parallel, three for high-resolution typing (i.e. A1H, A2H, A3H) and three for low-resolution typing (i.e. A1L, A2L, A3L). The probable aerosol type is provided by the high-resolution ANNs, while the predominant type is provided by the low-resolution ANNs. As such, if typing in high resolution fails due to the data quality, the user still has access to information in low resolution. If the depolarization is not available, the module runs three ANNs (i.e. B1L, B2L, B3L) in parallel and returns only the most probable predominant aerosol type (the volcanic type overlaps completely, in all existing parameters, with the dust or continental-polluted type and cannot be retrieved in low resolution). The output module prepares and saves the files in two formats, csv and human-readable (telegram) files, and writes a log. The csv files and the telegrams provide the following information:

• identification of the data sets for which the typing was performed;

• for each identified layer

• the geometrical top and bottom,

• the intensive optical parameters and associated uncertainties,

• the aerosol type retrieved by each ANN, and the number of agreements,

• the most probable type selected with the voting procedure (in low and high resolution separately if so),

• the type of the ANN delivering the result (i.e. 1, 2, or 3),

• comments generally referring to situations in which optical data did not pass the quality criteria or errors in the retrieval procedure.

Table 6. Optical properties of aerosols from the synthetic data set and measurements.

Additional information about the NATALI run (e.g. run time, run parameters, network error messages) is included in the telegrams. The software structure follows the three-module approach described earlier: an input module (nt_input.py), a typing module (nt_typing.py), and an output module (nt_output.py). The three modules are coordinated by the natali.py script, which contains the high-level algorithm and calls the required module routines/codes.

3 Results

The performances of the algorithm were tested in three steps. Firstly, the outputs of the aerosol model were compared with the literature for the values of the intensive optical parameters for each aerosol type considered in this study (Sect. 3.1). Secondly, the ANNs were selected based on their performances during the learning phase and also by comparison with a known reference (i.e. synthetic data) (Sect. 3.2). Thirdly, the complete NATALI algorithm was tested by comparing the retrieved aerosol types with the EARLINET-CALIPSO classification (Sect. 3.3).

## 3.1 Comparison of the aerosol model with the literature

Synthetic aerosol optical properties, i.e. the Ångström exponent (AE550_350), colour ratios (CR550_350 and CR1000_550), lidar ratios (LR350 and LR550), and linear particle depolarization ratio at 550 nm (DEP550), generated by the developed aerosol model have been compared with the measured intensive parameters for the six classes of pure aerosol. The comparison with previous literature was only possible for pure types because the properties of mixed aerosols are computed based on a linear progression of the corresponding optical properties for two pure types. As shown in Table 6, the synthetic data are in general in very good agreement with the values reported in previous studies (i.e. the range of synthetic values is between the minimum and maximum values reported in the literature). Synthetic values lower than those observed are found for continental-rural (AE550_350), continental-polluted (CR1000_550), and dust (CR1000_550) types. Synthetic values greater than those from the literature are found for continental-rural (LR350) and volcanic (DEP550) types. There are several reasons for these discrepancies. In some cases, values reported in the literature have high uncertainties because of natural variability, improper calibration and retrieval. The aerosol model also has some limitations, e.g. due to the spheroidal model and the mono-modal log-normal distribution.

Figure 4. Characteristic quantities of various atmospheric aerosol types from lidar measurements (a–c, adapted from Fig. 5 of the source study) and from synthetic measurements (d–f). (a, d) Lidar ratio versus linear particle depolarization. (b, e) Linear particle depolarization versus colour ratio. (c, f) Colour ratio versus lidar ratio.

When comparing the aerosol model with the results from the previous studies, the changes in OPAC concerning the hygroscopic growth need to be considered (Zieger et al., 2013). These changes have not been implemented here, because when this study was conducted the new OPAC hygroscopicity was not available. However, the changes in OPAC are not expected to produce major changes in the aerosol model, considering the large uncertainties introduced to the model to simulate the observations.

Figure 4 compares the synthetic data for pure aerosol obtained from the model with measurements reported in the literature. Based on the Airborne High Spectral Resolution Lidar (HSRL) data and in situ measurements of aerosol microphysical and optical properties collected during a series of measurement campaigns in 1998 (Lindenberg Aerosol Characterization Experiment, LACE), 2006 (The Saharan Mineral Dust Experiment, Morocco, SAMUM-1), and 2008 (The Saharan Mineral Dust Experiment, Cabo Verde islands, SAMUM-2, and European integrated project on Aerosol Cloud Climate, EUCAARI), that study developed an aerosol classification scheme for six aerosol types and aerosol mixtures (i.e. Saharan mineral dust, Saharan dust mixtures, Canadian biomass burning aerosol, African biomass burning mixture, anthropogenic pollution aerosol, and marine aerosol). The aerosol typing based on the lidar ratio and the linear depolarization ratio at 550 nm shows, in general, good agreement between the synthetic data and the observations at 532 nm (Fig. 4a and d), especially for smoke/biomass burning, industrial and marine types. The continental and volcanic aerosols are not represented in the measurements, so they were not compared. Dust presents lower depolarization values in the synthetic data (Fig. 4b and e) but similar values for the lidar ratio (Fig. 4c and f). Clusters were identified in both synthetic and observational data, which means that for pure aerosols the combination of extinction, backscatter, and depolarization at one wavelength could be sufficient for the ANN training.

Figure 5. Lidar ratio versus particle linear depolarization ratio. (a) Synthesis of ground-based observations and simulations, adapted from Fig. 1 of the source study. Filled stars represent simulations using the components of Aerosol CCI and variations with different refractive indexes and shape distributions (open stars). (b) Synthetic data from the NATALI aerosol model.

A synthesis of ground-based observations of lidar ratio and particle linear depolarization at 355 nm is available in the literature for different aerosol types (i.e. dust, smoke, pollution, marine aerosol, volcanic ash) and mixtures, collected during a series of measurement campaigns, i.e. PollyXT measurements at Cabo Verde, at the EARLINET stations of Leipzig and Munich, in the Amazon Basin, and on board Polarstern over the North Atlantic (Fig. 5a). The synthetic data show a wider spread because of the large uncertainty accepted for the input parameters. The very high values of the linear depolarization for smoke in the Aerosol CCI (European Space Agency Aerosol Climate Change Initiative) could not be achieved in the aerosol model (Fig. 5b).

When the entire output of the aerosol model is considered (i.e. 14 aerosol types) there is a high overlap between clusters, in particular for mixtures, due to the built-in uncertainty (Fig. 6a). Smoke and continental pollution almost completely overlap (Fig. 6a), which is consistent with measurements reported in the literature (Table 6). This makes the typing challenging. Particle depolarization, whose importance was shown relatively recently (Freudenthaler et al., 2009), can improve the aerosol typing (Fig. 6b). It contributes to the identification of complex mixtures and to the differentiation between mineral and volcanic particles. The main issue for particle depolarization is calibration, which has only recently been addressed (Belegante et al., 2018; McCullough et al., 2017), and thus few data sets satisfy the depolarization ratio quality criteria for aerosol typing. However, even without particle depolarization information, the low-resolution typing can identify the predominant aerosol types in a mixture.

Figure 6. Synthetic data set with (a) colour ratio versus lidar ratio, and (b) lidar ratio versus linear particle depolarization ratio using the NATALI classification.

## 3.2 ANN performance

Figure 7 shows the overall performances of the ANNs for the high-resolution typing (i.e. A1H, A2H, A3H) and low-resolution typing (i.e. A1L, A2L, A3L). In high-resolution typing at least 70 % of the aerosol types defined (i.e. 10 out of 14) should be correctly assessed in more than 75 % of the cases with a confidence higher than 0.7. In low-resolution typing at least 70 % of the predominant aerosol types (i.e. 4 out of 5) defined should be correctly assessed in more than 65 % of the cases with a confidence higher than 0.7.
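The acceptance criterion stated above can be written as a short check. In the sketch below, the per-type recognition rates are assumed to be already restricted to answers with a confidence higher than 0.7; the function name and the data layout are hypothetical.

```python
def scheme_accepted(per_type_rates, min_rate, min_fraction_of_types=0.7):
    """Check that enough aerosol types are recognized often enough.

    per_type_rates: aerosol type -> fraction of cases correctly assessed.
    min_rate: 0.75 for high-resolution typing, 0.65 for low-resolution typing.
    """
    n_ok = sum(1 for rate in per_type_rates.values() if rate > min_rate)
    return n_ok / len(per_type_rates) >= min_fraction_of_types
```

With 14 types, at least 10 must exceed the 75 % rate (10/14 ≈ 0.71); with 5 predominant types, at least 4 must exceed the 65 % rate.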

The aerosol type is recognized in more than 96 % of all cases in high-resolution typing (Fig. 7a). The missed cases are, in general, due to the complete overlap between the input parameters. For example, continental smoke is classified as smoke in 22 % of the missed cases (i.e. 1.9 % of the total number of cases); continental dust is classified as dust in 9 % of the missed cases (i.e. 0.3 % of the total number of cases). Note that 33 % of the missed cases (1.2 % of the total number of cases) are classified as unknown.

The predominant aerosol is recognized in more than 91 % of the cases in low-resolution typing (Fig. 7b). Most of the missed cases are due to the ANNs not being able to distinguish between continental aerosol and smoke, and continental-polluted aerosol (i.e. 36 % of the missed cases representing 3.2 % of the total number of cases), and between continental, smoke and marine aerosols, and continental-polluted aerosols (i.e. 35 % of the missed cases, 3.1 % of all cases). Continental-polluted and marine aerosols are sometimes identified by the ANNs as continental (i.e. 27 % of all missed cases, 2.4 % of the total number of cases).

Figure 7. Performances of ANNs for (a) high-resolution typing and (b) low-resolution typing for each ANN (i.e. A1H, A2H, A3H) and the combined results (vote) of the three ANNs. The intervals of ANN confidence levels are shaded according to the scale.

A3H and A3L were the best-performing ANNs but did not always have a high confidence level (Fig. 7). A2H and A2L have the lowest performances, but they can help in certain cases, for example in recognizing continental-dust aerosols. The voting procedure does not always provide the right answer, for example when A3H provides the correct typing but its confidence level is low.

The dependence of the aerosol typing on RH shows that the performance of the ANNs decreases with increasing RH only for continental-smoke and continental-dust in high-resolution typing (Fig. 8a) and for continental smoke and mixed smoke in low-resolution typing (Fig. 8b). Pure aerosol types are recognized for all values of RH. For coastal polluted, an increase in relative humidity results in an increase in typing performance. Overall, lower performances are obtained in low-resolution typing.

Figure 8. Performances of the ANNs for different relative humidity values (i.e. 50 %, 70 %, 80 %) for (a) high-resolution typing and (b) low-resolution typing. The intervals of ANN confidence levels are shaded according to scale.

## 3.3 Comparison with EARLINET-CALIPSO classification

Observational data from the EARLINET database (https://www.earlinet.org/index.php?id=earlinet_homepage, last access: 18 September 2018), related to the CALIPSO (Cloud-aerosol Lidar and Infrared Pathfinder Satellite Observation) overpasses over different EARLINET observational sites, were compared with the synthetic data obtained from the aerosol model. The EARLINET-CALIPSO database covers data from 2000–2018 and includes a total of 718 cases and 21 aerosol and cloud types. Only 13 of these cases contained all of the necessary parameters (i.e. 3 backscatters, 2 extinctions and 1 depolarization). In general, the missing parameter is the particle depolarization. To increase the number of cases, the particle depolarization was added assuming values reported in the literature as typical for the corresponding aerosol type. This way, 105 cases containing all needed parameters were obtained. The cases for which all parameters were within 20 % of relative error were selected (63 cases), of which 57 corresponded to known aerosol types.

Additionally, profiles available at the EARLINET site in Bucharest/Măgurele, established by the Romanian National Institute for Research and Development of Optoelectronics (INOE), were used to increase the validation measurement sample. The INOE database contains 464 measurement sets performed with the multiwavelength Raman depolarization lidar (Belegante et al., 2011) between June 2012 and September 2014. About 44.6 % of the measurements were conducted at night-time (including the Raman-derived extinction coefficient profiles). Out of these, 871 processed layers were obtained, containing backscatter, extinction and particle depolarization profiles averaged over 1 h. Only layers with significant loads (i.e. layers for which the uncertainty of the retrieved optical parameters is below the limits accepted by the algorithm) were selected, for which all intensive parameters were retrieved with relative uncertainties below 20 %. Mean values within each layer were computed, excluding the edges of the layers, where the smoothing introduces large errors due to the high gradients. For each layer, the Ångström exponent, colour ratio, colour index, lidar ratio, and linear particle depolarization ratio were computed. Thresholds were then used to estimate the type of aerosol at first glance, which resulted in a data set with 311 layers accepted by the algorithm, of which auxiliary data were available for only 182 layers. Auxiliary data were used to compare the results of the typing.

Figure 9. Results of the aerosol typing from NATALI aerosol model (synthetic data) and observations (observational data, EARLINET-CALIPSO database and additional data sets collected at the EARLINET station in Bucharest). (a, b) Lidar ratio and particle depolarization (VIS), (c, d) Ångström exponent and particle depolarization (VIS), and (e, f) lidar ratio (VIS) and lidar ratio (UV).

The time series of lidar measurements (532 nm volume depolarization and 355, 532, and 1064 nm range corrected signals) were used to identify the aerosol layers. The identification of the aerosol source was based on 96 h backward trajectories computed with HYSPLIT. The source was assumed to originate at the region where the trajectory was closest to the ground, providing guidance for identifying possible emission sources. The rainfall along the trajectory was used as an indicator of likely wet deposition. A synoptic diagnosis of the main meteorological fields (e.g. pressure, geopotential height, temperature, relative humidity, wind), based on the NCEP/NCAR Reanalysis, was used to confirm the aerosol trajectories and to determine the type of atmospheric circulation, weather regimes, and weather phenomena along the trajectories.

Figure 9 shows the comparison between the aerosol typing based on the aerosol model (synthetic data) and the EARLINET-CALIPSO and INOE database (observed data). The large spread of the measured parameters is caused by the mixtures of three components, incorrect calibration, or inappropriate estimation of aerosol type. On the other hand, the sparse observational data led to apparently incomplete clusters. No conclusions can be drawn for marine aerosols, as they are not represented in the observational data.

Low values of the Ångström exponent are observed for several cases in the dust polluted and smoke categories, and low values of the 532 nm lidar ratio are seen for several cases of continental and continental-dust types, indicating a small portion of marine particles. This is most likely because the particles are transported over a short distance above the sea before reaching the target and are thus misclassified. The high values of the Ångström exponent for some of the marine mineral aerosols indicate a mixture with smoke. Values of the Ångström exponent greater than 1.8, measured for smoke and continental-polluted aerosols, could not be simulated. In general, the agreement between the classification of the simulated data and the real observations is very good, given all the limitations discussed.

4 Conclusions

The NATALI algorithm is based on the ability of specialized ANNs to resolve the overlapping values of the intensive optical parameters calculated for each identified layer in the multiwavelength Raman lidar profiles. The ANNs were trained using synthetic data, for which a new aerosol model was developed. Aerosols were considered spheroids and built up using OPAC-defined internal mixtures, with the associated microphysical properties retrieved from GADS. The intensive optical properties obtained from this model were compared to the literature and found to be consistent with the observations. Variability in the optical properties was achieved by considering different numbers of mixing ratios and relative humidities. In addition, the uncertainty of the observations was included as a prerequisite hypothesis in order to match the lidar data. These requirements have added to the complexity of the ANNs selected to make the retrieval because of the significant overlap of the input values for the intensive optical parameters. Although the linear particle depolarization ratio is a crucial parameter in separating aerosol types, the depolarization methodology is still maturing and only a few lidar stations provide this parameter with an acceptable accuracy. Thus, two parallel typing schemes were developed: (a) a high-resolution typing scheme that allows the identification of 14 aerosol mixtures if the LPDR is available in the input data files, and (b) a low-resolution typing scheme that allows the identification of five predominant aerosol types when LPDR is not provided. For each scheme, three ANNs are run simultaneously. Then a voting procedure is applied to select the most probable answer. The ANNs were selected out of 68 tested structures as having the best performances for the aerosol typing. The voting is based on the confidence of the retrieval for each of the three ANNs and the stability of the retrieval over the uncertainty range. 
A series of tests showed that considering the variation with the RH from the beginning helped to make the retrieval stable for different atmospheric conditions. Also, considering the 50 % uncertainty for the input data gave realistic retrievals of aerosols, making possible the retrieval of aerosol types when using medium-quality lidar data, which is currently the case for research lidar networks. Without depolarization, the retrieval is much less certain, especially for mixtures, and questionable results were flagged. Spectral characteristics of volcanic aerosols are very similar to those of mineral dust and/or continental polluted, and this type cannot be distinguished if the LPDR is not provided.

The whole algorithm has been integrated into a Python code, available as source code and executable on the NATALI website (http://natali.inoe.ro/resources.html/software, last access: 18 September 2018). The software accommodates lidar profiles, i.e. $\mathrm{3}\mathit{\beta }+\mathrm{2}\mathit{\alpha }\left(+\mathrm{1}\mathit{\delta }\right)$, in the EARLINET data format. NATALI is user-friendly, and a user guide is provided. However, it is important that the user understands the outputs and the limitations of the algorithm; i.e. the results are strongly dependent on the quality of the input data, and the outputs should be understood accordingly. Although the neural network is able to recognize the pattern of noisy data, the pattern has to be present and correct, otherwise the result of the retrieval will be incorrect. The NATALI algorithm was able to

• recognize the aerosol types (high resolution, 14 types) in more than 70 % of the cases for high-quality optical data (i.e. the uncertainty of the intensive optical parameters of less than 20 %);

• recognize the predominant aerosol types (low resolution, 6 or 5 types) in more than 70 % of the cases for medium and high-quality optical data (i.e. the uncertainty of the intensive optical parameters less than 50 %);

• provide stable responses for RH up to 70 %, and even higher for less hygroscopic aerosols;

• provide results that are comparable in high and low resolution, considering the correspondence of the types defined.

Furthermore, the computing time of the algorithm is relatively short due to the optimization of the Python code. The algorithm has side applications; for example, it can be applied to test the quality of the optical data and to identify incorrect calibration or incorrect cloud screening (Nicolae et al., 2018). Blind tests on EARLINET data samples showed the capability of this tool to retrieve the aerosol type from a large variety of data, with different levels of quality and physical content. More complex data sets (e.g. availability of LPDR at 355 and/or 1064 nm) will not produce improvements with the current software because ANNs are specifically trained for $\mathrm{3}\mathit{\beta }+\mathrm{2}\mathit{\alpha }+\mathrm{1}\mathit{\delta }$ data sets. However, the ANNs can be trained with more complete data sets, which can potentially lead to better scores, especially in the case of mixtures. Moreover, a similar approach could be used for any other optical instrument (e.g. photometry) as long as the physical content of the input optical parameters is sufficiently rich.

Code availability

The NATALI (Neural Network Aerosol Typing Algorithm Based on Lidar Data) software – developed by Doina Nicolae, Jeni Vasilescu, Camelia Talianu, Ioannis Binietoglou, and Victor Nicolae – is available with a user guide from http://natali.inoe.ro/resources.html/software (last access: 18 September 2018).

Author contributions

DN carried out the research design and developed the aerosol-typing algorithm. JV designed the artificial neural networks and conducted the statistical analysis of the output. CT developed the aerosol model. IB carried out the comparison of the results with the previous research and the testing of the typing algorithm. VN developed the code for the aerosol-typing algorithm. All the authors participated in the interpretation of the results and the writing and editing process.

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

The work presented in this paper was performed in the frame of the project Neural network Aerosol Typing Algorithm based on LIdar data (NATALI) funded by ESA under contract 4000110671/14/I-LG. Also, this project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 654109 ACTRIS-2, grant agreement no. 692014 ECARS, and Core National Program PN2018 33N/16.03.2018 funded by the Ministry of Research and Innovation. Camelia Talianu was also supported by the Austrian Science Fund FWF, Project M 2031, Meitner-Programm.

Edited by: Vassilis Amiridis
Reviewed by: two anonymous referees

References

Alados-Arboledas, L., Müller, D., Guerrero-Rascado, J. L., Navas-Guzman, F., Perez-Ramirez, D., and Olmo, F. J.: Optical and microphysical properties of fresh biomass burning aerosol retrieved by Raman lidar, and star- and sun-photometry, Geophys. Res. Lett., 38, 1–5, https://doi.org/10.1029/2010GL045999, 2011. a, b

Ali, A., Amin, S. E., Ramadan, H. H., and Tolba, M. F.: Enhancement of OMI aerosol optical depth data assimilation using artificial neural network, Neural Comput. Appl., 23, 2267–2279, https://doi.org/10.1007/s00521-012-1178-9, 2012. a

Amiridis, V., Giannakaki, E., Balis, D. S., Gerasopoulos, E., Pytharoulis, I., Zanis, P., Kazadzis, S., Melas, D., and Zerefos, C.: Smoke injection heights from agricultural burning in Eastern Europe as seen by CALIPSO, Atmos. Chem. Phys., 10, 11567–11576, https://doi.org/10.5194/acp-10-11567-2010, 2010. a

Ansmann, A., Wagner, F., Müller, D., Althausen, D., Herber, A., von Hoyningen-Huene, W., and Wandinger, U.: European pollution outbreaks during ACE 2: Optical particle properties inferred from multiwavelength lidar and star-Sun photometry, J. Geophys. Res., 107, 4259, https://doi.org/10.1029/2001JD001109, 2002. a, b

Ansmann, A., Tesche, M., Groß, S., Freudenthaler, V., Seifert, P., Hiebsch, A., Schmidt, J., Wandinger, U., Mattis, I., Müller, D., and Wiegner, M.: The 16 April 2010 major volcanic ash plume over central Europe: EARLINET lidar and AERONET photometer observations at Leipzig and Munich, Germany, Geophys. Res. Lett., 37, L13810, https://doi.org/10.1029/2010GL043809, 2010. a, b, c, d, e, f, g

Baars, H., Ansmann, A., Althausen, D., Engelmann, R., Heese, B., Müller, D., Artaxo, P., Paixao, M., Pauliquevis, T., and Souza, R.: Aerosol profiling with lidar in the Amazon Basin during the wet and dry season, J. Geophys. Res., 117, D21201, https://doi.org/10.1029/2012JD018338, 2012. a

Belegante, L., Talianu, C., Nemuc, A., and Nicolae, D.: Detection of local weather events from multiwavelength lidar measurements during the EARLI09 campaign, Rom. J. Phys., 56, 484–494, 2011. a

Belegante, L., Nicolae, D., Nemuc, A., Talianu, C., and Derognat, C.: Retrieval of the boundary layer height from active and passive remote sensors, Comparison with a NWP model, Acta Geophys., 62, 276–289, https://doi.org/10.2478/s11600-013-0167-4, 2014. a

Belegante, L., Bravo-Aranda, J. A., Freudenthaler, V., Nicolae, D., Nemuc, A., Ene, D., Alados-Arboledas, L., Amodeo, A., Pappalardo, G., D'Amico, G., Amato, F., Engelmann, R., Baars, H., Wandinger, U., Papayannis, A., Kokkalis, P., and Pereira, S. N.: Experimental techniques for the calibration of lidar depolarization channels in EARLINET, Atmos. Meas. Tech., 11, 1119–1141, https://doi.org/10.5194/amt-11-1119-2018, 2018. a

Berdnik, V. V. and Loikov, V. A.: Neural networks for aerosol particles characterization, J. Quant. Spectrosc. Ra., 184, 135–145, https://doi.org/10.1016/j.jqsrt.2016.06.034, 2016. a

Bishop, C.: Neural Networks for Pattern Recognition, Clarendon Press, 2000. a

Boselli, A., Caggiano, R., Cornacchia, C., Madonna, F., Mona, L., Macchiato, M., Pappalardo, G., and Trippetta, S.: Multi year sunphotometer measurements for aerosol characterization in a Central Mediterranean site, Atmos. Res., 104–105, 98–110, https://doi.org/10.1016/j.atmosres.2011.08.002, 2012. a

Brock, C. A., Wagner, N. L., Anderson, B. E., Attwood, A. R., Beyersdorf, A., Campuzano-Jost, P., Carlton, A. G., Day, D. A., Diskin, G. S., Gordon, T. D., Jimenez, J. L., Lack, D. A., Liao, J., Markovic, M. Z., Middlebrook, A. M., Ng, N. L., Perring, A. E., Richardson, M. S., Schwarz, J. P., Washenfelder, R. A., Welti, A., Xu, L., Ziemba, L. D., and Murphy, D. M.: Aerosol optical properties in the southeastern United States in summer – Part 1: Hygroscopic growth, Atmos. Chem. Phys., 16, 4987–5007, https://doi.org/10.5194/acp-16-4987-2016, 2016a. a

Brock, C. A., Wagner, N. L., Anderson, B. E., Beyersdorf, A., Campuzano-Jost, P., Day, D. A., Diskin, G. S., Gordon, T. D., Jimenez, J. L., Lack, D. A., Liao, J., Markovic, M. Z., Middlebrook, A. M., Perring, A. E., Richardson, M. S., Schwarz, J. P., Welti, A., Ziemba, L. D., and Murphy, D. M.: Aerosol optical properties in the southeastern United States in summer – Part 2: Sensitivity of aerosol optical depth to relative humidity and aerosol parameters, Atmos. Chem. Phys., 16, 5009–5019, https://doi.org/10.5194/acp-16-5009-2016, 2016b. a

Burton, S. P., Ferrare, R. A., Hostetler, C. A., Hair, J. W., Rogers, R. R., Obland, M. D., Butler, C. F., Cook, A. L., Harper, D. B., and Froyd, K. D.: Aerosol classification using airborne High Spectral Resolution Lidar measurements – methodology and examples, Atmos. Meas. Tech., 5, 73–98, https://doi.org/10.5194/amt-5-73-2012, 2012. a, b, c, d, e

Burton, S. P., Ferrare, R. A., Vaughan, M. A., Omar, A. H., Rogers, R. R., Hostetler, C. A., and Hair, J. W.: Aerosol classification from airborne HSRL and comparisons with the CALIPSO vertical feature mask, Atmos. Meas. Tech., 6, 1397–1412, https://doi.org/10.5194/amt-6-1397-2013, 2013. a, b, c, d, e, f, g, h, i, j, k, l, m, n

Burton, S. P., Vaughan, M. A., Ferrare, R. A., and Hostetler, C. A.: Separating mixtures of aerosol types in airborne High Spectral Resolution Lidar data, Atmos. Meas. Tech., 7, 419–436, https://doi.org/10.5194/amt-7-419-2014, 2014. a, b

Burton, S. P., Hair, J. W., Kahnert, M., Ferrare, R. A., Hostetler, C. A., Cook, A. L., Harper, D. B., Berkoff, T. A., Seaman, S. T., Collins, J. E., Fenn, M. A., and Rogers, R. R.: Observations of the spectral dependence of linear particle depolarization ratio of aerosols using NASA Langley airborne High Spectral Resolution Lidar, Atmos. Chem. Phys., 15, 13453–13473, https://doi.org/10.5194/acp-15-13453-2015, 2015. a, b, c

Calvo, A. I., Alves, C., Castro, A., Pont, V., Vicente, A., and Fraile, R.: Research on aerosol sources and chemical composition: Past, current and emerging issues, Atmos. Res., 120–121, 1–28, https://doi.org/10.1016/j.atmosres.2012.09.021, 2013. a

Cattrall, C., Reagan, J., Thome, K., and Dubovik, O.: Variability of aerosol and spectral lidar and backscatter and extinction ratios of key aerosol types derived from selected Aerosol Robotic Network locations, J. Geophys. Res., 110, D10S11, https://doi.org/10.1029/2004JD005124, 2005. a, b, c

David, G., Thomas, B., Nousiainen, T., Miffre, A., and Rairoux, P.: Retrieving simulated volcanic, desert dust and sea-salt particle properties from two/three-component particle mixtures using UV-VIS polarization lidar and T matrix, Atmos. Chem. Phys., 13, 6757–6776, https://doi.org/10.5194/acp-13-6757-2013, 2013. a

Di Noia, A., Hasekamp, O. P., van Harten, G., Rietjens, J. H. H., Smit, J. M., Snik, F., Henzing, J. S., de Boer, J., Keller, C. U., and Volten, H.: Use of neural networks in ground-based aerosol retrievals from multi-angle spectropolarimetric observations, Atmos. Meas. Tech., 8, 281–299, https://doi.org/10.5194/amt-8-281-2015, 2015. a

Dubovik, O., Holben, B. N., Eck, T. F., Smirnov, A., Kaufman, Y. J., King, M. D., Tanre, D., and Slutsker, I.: Variability of absorption and optical properties of key aerosol types observed in worldwide locations, J. Atmos. Sci., 59, 590–608, https://doi.org/10.1175/1520-0469(2002)059<0590:VOAAOP>2.0.CO;2, 2002. a, b

Fan, J., Wang, Y., Rosenfeld, D., and Liu, X.: Review of aerosol-cloud interactions: Mechanisms, significance, and challenges, J. Atmos. Sci., 73, 4221–4252, https://doi.org/10.1175/JAS-D-16-0037.1, 2016. a

Fernández, A. J., Sicard, M., Costa, M. J., Guerrero-Rascado, J. L., Gómez-Amo, J. L., Molero, F., Barragán, R., Bortoli, D., Bedoya-Velásquez, A. E., Utrillas, M. P., Salvador, P., Granados-Muñoz, M. J., Potes, M., Ortiz-Amezcua, P., Martínez-Lozano, J. A., Artíñano, B., Muñoz-Porcar, C., Salgado, R., Román, R., Rocadenbosch, F., Salgueiro, V., Benavent-Oltra, J. A., Rodríguez-Gómez, A., Alados-Arboledas, L., Comerón, A., and Pujadas, M.: February 2017 extreme Saharan dust outbreak in the Iberian Peninsula: from lidar-derived optical properties to evaluation of forecast models, Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-370, in review, 2018. a, b, c

Freudenthaler, V., Esselborn, M., Wiegner, M., Heese, B., Tesche, M., Ansmann, A., Müller, D., Althausen, D., Wirth, M., Fix, A., Ehret, G., Knippertz, P., Toledano, C., Gasteiger, J., Garhammer, M., and Seefeldner, M.: Depolarization ratio profiling at several wavelengths in pure Saharan dust during SAMUM 2006, Tellus B, 61, 165–179, https://doi.org/10.1111/j.1600-0889.2008.00396.x, 2009. a, b, c, d

Fuzzi, S., Baltensperger, U., Carslaw, K., Decesari, S., Denier van der Gon, H., Facchini, M. C., Fowler, D., Koren, I., Langford, B., Lohmann, U., Nemitz, E., Pandis, S., Riipinen, I., Rudich, Y., Schaap, M., Slowik, J. G., Spracklen, D. V., Vignati, E., Wild, M., Williams, M., and Gilardoni, S.: Particulate matter, air quality and climate: lessons learned and future needs, Atmos. Chem. Phys., 15, 8217–8299, https://doi.org/10.5194/acp-15-8217-2015, 2015. a

Gasteiger, J., Groß, S., Freudenthaler, V., and Wiegner, M.: Volcanic ash from Iceland over Munich: mass concentration retrieved from ground-based remote sensing measurements, Atmos. Chem. Phys., 11, 2209–2223, https://doi.org/10.5194/acp-11-2209-2011, 2011a. a

Gasteiger, J., Wiegner, M., Groß, S., Freudenthaler, V., Toledano, C., Tesche, M., and Kandler, K.: Modeling lidar-relevant optical properties of complex mineral dust aerosols, Tellus B, 63, 725–741, https://doi.org/10.1111/j.1600-0889.2011.00559.x, 2011b. a, b

Gayatri, K., Patade, S., and Prabha, T. V.: Aerosol–Cloud interaction in deep convective clouds over the Indian Peninsula using spectral (bin) microphysics, J. Atmos. Sci., 74, 3145–3166, https://doi.org/10.1175/JAS-D-17-0034.1, 2017. a

Giannakaki, E., Balis, D. S., Amiridis, V., and Zerefos, C.: Optical properties of different aerosol types: seven years of combined Raman-elastic backscatter lidar measurements in Thessaloniki, Greece, Atmos. Meas. Tech., 3, 569–578, https://doi.org/10.5194/amt-3-569-2010, 2010. a, b, c, d, e, f, g, h

Giles, D. M., Holben, B. N., Eck, T. F., Sinyuk, A., Smirnov, A., Slutsker, I., Dickerson, R. R., Thompson, A. M., and Schafer, J. S.: An analysis of AERONET aerosol absorption properties and classifications representative of aerosol source regions, J. Geophys. Res., 117, D17203, https://doi.org/10.1029/2012JD018127, 2012. a

Granados-Muñoz, M. J., Navas-Guzmán, F., Guerrero-Rascado, J. L., Bravo-Aranda, J. A., Binietoglou, I., Pereira, S. N., Basart, S., Baldasano, J. M., Belegante, L., Chaikovsky, A., Comerón, A., D'Amico, G., Dubovik, O., Ilic, L., Kokkalis, P., Muñoz-Porcar, C., Nickovic, S., Nicolae, D., Olmo, F. J., Papayannis, A., Pappalardo, G., Rodríguez, A., Schepanski, K., Sicard, M., Vukovic, A., Wandinger, U., Dulac, F., and Alados-Arboledas, L.: Profiling of aerosol microphysical properties at several EARLINET/AERONET sites during the July 2012 ChArMEx/EMEP campaign, Atmos. Chem. Phys., 16, 7043–7066, https://doi.org/10.5194/acp-16-7043-2016, 2016. a, b

Groß, S., Tesche, M., Freudenthaler, V., Toledano, C., Wiegner, M., Ansmann, A., Althausen, D., and Seefeldner, M.: Characterization of saharan dust, marine aerosols and mixtures of biomass burning aerosols and dust by means of multi-wavelength depolarization- and Raman-measurements during SAMUM-2, Tellus B, 63, 706–724, https://doi.org/10.1111/j.1600-0889.2011.00556.x, 2011. a, b, c, d, e, f, g, h, i, j

Groß, S., Freudenthaler, V., Wiegner, M., Gasteiger, J., Geiß, A., and Schnell, F.: Dual-wavelength linear depolarization ratio of volcanic aerosols: Lidar measurements of the Eyjafjallajökull plume over Maisach, Germany, Atmos. Environ., 48, 85–96, https://doi.org/10.1016/j.atmosenv.2011.06.017, 2012. a

Groß, S., Esselborn, M., Weinzierl, B., Wirth, M., Fix, A., and Petzold, A.: Aerosol classification by airborne high spectral resolution lidar observations, Atmos. Chem. Phys., 13, 2487–2505, https://doi.org/10.5194/acp-13-2487-2013, 2013. a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u

Groß, S., Freudenthaler, V., Schepanski, K., Toledano, C., Schäfler, A., Ansmann, A., and Weinzierl, B.: Optical properties of long-range transported Saharan dust over Barbados as measured by dual-wavelength depolarization Raman lidar measurements, Atmos. Chem. Phys., 15, 11067–11080, https://doi.org/10.5194/acp-15-11067-2015, 2015. a, b, c

Groß, S., Gasteiger, J., Freudenthaler, V., Müller, T., Sauer, D., Toledano, C., and Ansmann, A.: Saharan dust contribution to the Caribbean summertime boundary layer – a lidar study during SALTRACE, Atmos. Chem. Phys., 16, 11535–11546, https://doi.org/10.5194/acp-16-11535-2016, 2016. a, b, c, d

Gupta, P. and Christopher, S. A.: Particulate matter air quality assessment using integrated surface, satellite, and meteorological products: 2. A neural network approach, J. Geophys. Res., 114, D20205, https://doi.org/10.1029/2008JD011497, 2009. a

Hair, J. W., Hostetler, C. A., Cook, A. L., Harper, D. B., Ferrare, R. A., Mack, T. L., Welch, W., Izquierdo, L. R., and Hovis, F. E.: Airborne High Spectral Resolution Lidar for profiling aerosol optical properties, Appl. Optics, 47, 6734–6752, https://doi.org/10.1364/AO.47.006734, 2008. a

Hamill, P., Giordano, M., Ward, C., Giles, D., and Holben, B.: An AERONET-based aerosol classification using the Mahalanobis distance, Atmos. Environ., 140, 213–233, https://doi.org/10.1016/j.atmosenv.2016.06.002, 2016. a

Henriksen, T., Ring, T., Call, D., Eddings, E., and Sarofim, A.: Determination of soot refractive index as a function of height in an inverse diffusion flame, in: 5th US Combustion Meeting, San Diego, California, 25–28 March 2007, 1795–1803, 2007. a

Hess, M., Koepke, P., and Schult, I.: Optical properties of aerosols and clouds: The software package OPAC, B. Am. Meteorol. Soc., 79, 831–844, https://doi.org/10.1175/1520-0477(1998)079<0831:OPOAAC>2.0.CO;2, 1998. a

Holben, B. N., Eck, T. F., Slutsker, I., Tanré, D., Buis, J. P., Setzer, A., Vermote, E., Reagan, J. A., Kaufman, Y., Nakajima, T., Lavenu, F., Jankowiak, I., and Smirnov, A.: AERONET – A federated instrument network and data archive for aerosol characterization, Remote Sens. Environ., 66, 1–16, https://doi.org/10.1016/s0034-4257(98)00031-5, 1998. a

Jain, A. K., Duin, R. P. W., and Mao, J.: Statistical Pattern Recognition: A Review, IEEE T. Pattern Anal., 22, 4–37, https://doi.org/10.1109/34.824819, 2000. a

Janicka, L., Stachlewska, I. S., Veselovskii, I., and Baars, H.: Temporal variations in optical and microphysical properties of mineral dust and biomass burning aerosol derived from daytime Raman lidar observations over Warsaw, Poland, Atmos. Environ., 169, 162–174, https://doi.org/10.1016/j.atmosenv.2017.09.022, 2017. a, b, c, d, e, f, g

Kahn, R. A. and Gaitley, B. J.: Analysis of global aerosol type as retrieved by MISR, J. Geophys. Res.-Atmos., 120, 4248–4281, https://doi.org/10.1002/2015JD023322, 2015. a

Kahn, R. A., Gaitley, B. J., Garay, M. J., Diner, D. J., Eck, T. F., Smirnov, A., and Holben, B. N.: Multiangle Imaging SpectroRadiometer global aerosol product assessment by comparison with the Aerosol Robotic Network, J. Geophys. Res., 115, D23209, https://doi.org/10.1029/2010JD014601, 2010. a

Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., Zhu, Y., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Leetmaa, A., Reynolds, R., Jenne, R., and Joseph, D.: The NCEP/NCAR 40-Year Reanalysis Project, B. Am. Meteorol. Soc., 77, 437–472, https://doi.org/10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2, 1996. a

Kanitz, T., Ansmann, A., Engelmann, R., and Althausen, D.: North-south cross sections of the vertical aerosol distribution over the Atlantic Ocean from multiwavelength Raman/polarization lidar during Polarstern cruises, J. Geophys. Res.-Atmos., 118, 2643–2655, https://doi.org/10.1002/jgrd.50273, 2013. a

Koepke, P., Hess, M., Schult, I., and Shettle, E. P.: Global Aerosol Data Set, Report No. 243, Max-Planck-Institut für Meteorologie, Hamburg, ISSN 0937-1060, 1997. a, b

Liu, Z., Sugimoto, N., and Murayama, T.: Extinction-to-backscatter ratio of Asian dust observed with high-spectral-resolution lidar and Raman lidar, Appl. Opt., 41, 2760–2767, https://doi.org/10.1364/AO.41.002760, 2002. a, b

Mahowald, N.: Aerosol indirect effect on biogeochemical cycles and climate, Science, 334, 794–796, https://doi.org/10.1126/science.1207374, 2011. a

Mahowald, N. M., Scanza, R., Brahney, J., Goodale, C. L., Hess, P. G., Moore, J. K., and Neff, J.: Aerosol deposition impacts on land and ocean carbon cycles, Current Climate Change Reports, 3, 16–31, https://doi.org/10.1007/s40641-017-0056-z, 2017. a

Mamouri, R. E., Papayannis, A., Amiridis, V., Müller, D., Kokkalis, P., Rapsomanikis, S., Karageorgos, E. T., Tsaknakis, G., Nenes, A., Kazadzis, S., and Remoundaki, E.: Multi-wavelength Raman lidar, sun photometric and aircraft measurements in combination with inversion models for the estimation of the aerosol optical and physico-chemical properties over Athens, Greece, Atmos. Meas. Tech., 5, 1793–1808, https://doi.org/10.5194/amt-5-1793-2012, 2012. a

Marinescu, P. J., van den Heever, S., Saleeby, S. M., Kreidenweis, S. M., and DeMott, P. J.: The microphysical roles of lower-tropospheric versus midtropospheric aerosol particles in mature-stage MCS precipitation, J. Atmos. Sci., 74, 3657–3678, https://doi.org/10.1175/JAS-D-16-0361.1, 2017. a

Mattis, I., Seifert, P., Müller, D., Tesche, M., Hiebsch, A., Kanitz, T., Schmidt, J., Finger, F., Wandinger, U., and Ansmann, A.: Volcanic aerosol layers observed with multiwavelength Raman lidar over central Europe in 2008–2009, J. Geophys. Res.-Atmos., 115, D00L04, https://doi.org/10.1029/2009JD013472, 2010. a, b, c

McCullough, E. M., Sica, R. J., Drummond, J. R., Nott, G., Perro, C., Thackray, C. P., Hopper, J., Doyle, J., Duck, T. J., and Walker, K. A.: Depolarization calibration and measurements using the CANDAC Rayleigh–Mie–Raman lidar at Eureka, Canada, Atmos. Meas. Tech., 10, 4253–4277, https://doi.org/10.5194/amt-10-4253-2017, 2017. a

Mishchenko, M. I. and Travis, L. D.: T-matrix computations of light scattering by large spheroidal particles, Opt. Commun., 109, 16–21, https://doi.org/10.1016/0030-4018(94)90731-5, 1994. a

Mishchenko, M. I., Travis, L. D., and Mackowski, D. W.: T-matrix computations of light scattering by nonspherical particles: A review, J. Quant. Spectrosc. Ra., 55, 535–575, https://doi.org/10.1016/0022-4073(96)00002-7, 1996. a

Mona, L., Liu, Z., Müller, D., Omar, A., Papayannis, A., Pappalardo, G., Sugimoto, N., and Vaughan, M.: Lidar measurements for desert dust characterization: An overview, Adv. Meteorol., 2012, 356265, https://doi.org/10.1155/2012/356265, 2012. a, b, c, d, e

Marmureanu, L., Vasilescu, J., Nemuc, A., Nicolae, D., and Belegante, L.: Aerosol characterization based on chemical composition and optical properties, Rom. J. Phys., 61, 1635–1650, 2016. a

Marmureanu, L., Vasilescu, J., Ştefănie, H., and Talianu, C.: Chemical and optical characterization of submicronic aerosol sources, Environ. Eng. Manag. J., 16, 2165–2172, 2017. a

Müller, D., Wandinger, U., and Ansmann, A.: Microphysical particle parameters from extinction and backscatter lidar data by inversion with regularization: Theory, Appl. Opt., 38, 2346–2357, https://doi.org/10.1364/ao.38.002346, 1999a. a

Müller, D., Wandinger, U., and Ansmann, A.: Microphysical particle parameters from extinction and backscatter lidar data by inversion with regularization: Simulation, Appl. Opt., 38, 2358–2368, https://doi.org/10.1364/ao.38.002358, 1999b. a

Müller, D., Mattis, I., Wandinger, U., Ansmann, A., Althausen, D., and Stohl, A.: Raman lidar observations of aged Siberian and Canadian forest fire smoke in the free troposphere over Germany in 2003: Microphysical particle characterization, J. Geophys. Res., 110, D17201, https://doi.org/10.1029/2004JD005756, 2005. a, b, c, d

Müller, D., Ansmann, A., Mattis, I., Tesche, M., Wandinger, U., Althausen, D., and Pisani, G.: Aerosol-type dependent lidar ratios observed with Raman lidar, J. Geophys. Res.-Atmos., 112, D16202, https://doi.org/10.1029/2006JD008292, 2007. a, b, c, d, e, f, g, h, i, j, k

Müller, D., Weinzierl, B., Petzold, A., Kandler, K., Ansmann, A., Müller, T., Tesche, M., Freudenthaler, V., Esselborn, M., Heese, B., Althausen, D., Schladitz, A., Otto, S., and Knippertz, P.: Mineral dust observed with AERONET sun photometer, Raman lidar, and in situ instruments during SAMUM 2006: shape-independent particle properties, J. Geophys. Res., 115, D07202, https://doi.org/10.1029/2009JD012520, 2010. a

Murayama, T., Masonis, S. J., Redemann, J., Anderson, T. L., Schmid, B., Livingston, J. M., Russell, P. B., Huebert, B., Howell, S. G., McNaughton, C. S., Clarke, A., Abo, M., Shimizu, A., Sugimoto, N., Yabuki, M., Kuze, H., Fukagawa, S., Maxwell-Meier, K., Weber, R. J., Orsini, D. A., Blomquist, B., Bandy, A., and Thornton, D.: An intercomparison of lidar-derived aerosol optical properties with airborne measurements near Tokyo during ACE-Asia, J. Geophys. Res., 108, 8651, https://doi.org/10.1029/2002JD003259, 2003. a

Murayama, T., Müller, D., Wada, K., Shimizu, A., Sekiguchi, M., and Tsukamoto, T.: Characterization of Asian dust and Siberian smoke with multi-wavelength Raman lidar over Tokyo, Japan in spring 2003, Geophys. Res. Lett., 31, L23103, https://doi.org/10.1029/2004GL021105, 2004. a

Nemuc, A., Vasilescu, J., Talianu, C., Belegante, L., and Nicolae, D.: Assessment of aerosol's mass concentrations from measured linear particle depolarization ratio (vertically resolved) and simulations, Atmos. Meas. Tech., 6, 3243–3255, https://doi.org/10.5194/amt-6-3243-2013, 2013. a

Nicolae, D., Nemuc, A., Müller, D., Talianu, C., Vasilescu, J., Belegante, L., and Kolgotin, A.: Characterization of fresh and aged biomass burning events using multiwavelength Raman lidar and mass spectrometry, J. Geophys. Res.-Atmos., 118, 2956–2965, https://doi.org/10.1002/jgrd.50324, 2013. a

Nicolae, D., Talianu, C., Vasilescu, J., Nicolae, V., and Stachlewska, I. S.: Strengths and limitations of the NATALI code for aerosol typing from multiwavelength Raman lidar observations, The 28th International Laser Radar Conference (ILRC 28), 176, 14123–14146, https://doi.org/10.1051/epjconf/201817605005, 2018. a

Nielsen, M. A.: Neural Networks and Deep Learning, Determination Press, 2015. a

Noh, Y., Kim, Y., Choi, B., and Murayama, T.: Aerosol lidar ratio characteristics measured by a multi-wavelength Raman lidar system at Anmyeon Island, Korea, Atmos. Res., 86, 76–87, https://doi.org/10.1016/j.atmosres.2007.03.006, 2007. a

Omar, A. H., Winker, D. M., Vaughan, M. A., Hu, Y., Trepte, C. R., Ferrare, R. A., Lee, K. L., Hostetler, C. A., Kittaka, C., Rogers, R. R., Kuehn, R. E., and Liu, Z.: The CALIPSO Automated Aerosol Classification and Lidar Ratio Selection Algorithm, J. Atmos. Ocean. Tech., 26, 1994–2014, https://doi.org/10.1175/2009jtecha1231.1, 2009. a

Orlandini, S. and Morlini, I.: Artificial neural network estimation of rainfall intensity from radar observations, J. Geophys. Res., 105, 24849–24861, https://doi.org/10.1029/2000JD900408, 2000. a

Ortiz-Amezcua, P., Guerrero-Rascado, J. L., Granados-Muñoz, M. J., Benavent-Oltra, J. A., Böckmann, C., Samaras, S., Stachlewska, I. S., Janicka, L., Baars, H., Bohlmann, S., and Alados-Arboledas, L.: Microphysical characterization of long-range transported biomass burning particles from North America at three EARLINET stations, Atmos. Chem. Phys., 17, 5931–5946, https://doi.org/10.5194/acp-17-5931-2017, 2017. a

Palacios-Peña, L., Baró, R., Baklanov, A., Balzarini, A., Brunner, D., Forkel, R., Hirtl, M., Honzak, L., López-Romero, J. M., Montávez, J. P., Pérez, J. L., Pirovano, G., San José, R., Schröder, W., Werhahn, J., Wolke, R., Žabkar, R., and Jiménez-Guerrero, P.: An assessment of aerosol optical properties from remote-sensing observations and regional chemistry–climate coupled models over Europe, Atmos. Chem. Phys., 18, 5021–5043, https://doi.org/10.5194/acp-18-5021-2018, 2018. a

Papagiannopoulos, N., Mona, L., Alados-Arboledas, L., Amiridis, V., Baars, H., Binietoglou, I., Bortoli, D., D'Amico, G., Giunta, A., Guerrero-Rascado, J. L., Schwarz, A., Pereira, S., Spinelli, N., Wandinger, U., Wang, X., and Pappalardo, G.: CALIPSO climatological products: evaluation and suggestions from EARLINET, Atmos. Chem. Phys., 16, 2341–2357, https://doi.org/10.5194/acp-16-2341-2016, 2016. a

Papayannis, A., Amiridis, V., Mona, L., Tsaknakis, G., Balis, D., Bösenberg, J., Chaikovski, A., DeTomasi, F., Grigorov, I., Mattis, I., Mitev, V., Müller, D., Nickovic, S., Pérez, C., Pietruczuk, A., Pisani, G., Ravetta, F., Rizi, V., Sicard, M., Trickl, T., Wiegner, M., Gerding, M., Mamouri, R. E., D'Amico, G., and Pappalardo, G.: Systematic lidar observations of Saharan dust over Europe in the frame of EARLINET (2000–2002), J. Geophys. Res., 113, D10204, https://doi.org/10.1029/2007JD009028, 2008. a, b

Pappalardo, G., Wandinger, U., Mona, L., Hiebsch, A., Mattis, I., Amodeo, A., Ansmann, A., Seifert, P., Linné, H., Apituley, A., Arboledas, L. A., Balis, D., Chaikovsky, A., D'Amico, G., De Tomasi, F., Freudenthaler, V., Giannakaki, E., Giunta, A., Grigorov, I., Iarlori, M., Madonna, F., Mamouri, R., Nasti, L., Papayannis, A., Pietruczuk, A., Pujadas, M., Rizi, V., Rocadenbosch, F., Russo, F., Schnell, F., Spinelli, N., Wang, X., and Wiegner, M.: EARLINET correlative measurements for CALIPSO: First intercomparison results, J. Geophys. Res., 115, D00H19, https://doi.org/10.1029/2009JD012147, 2010. a

Pappalardo, G., Amodeo, A., Apituley, A., Comeron, A., Freudenthaler, V., Linné, H., Ansmann, A., Bösenberg, J., D'Amico, G., Mattis, I., Mona, L., Wandinger, U., Amiridis, V., Alados-Arboledas, L., Nicolae, D., and Wiegner, M.: EARLINET: towards an advanced sustainable European aerosol lidar network, Atmos. Meas. Tech., 7, 2389–2409, https://doi.org/10.5194/amt-7-2389-2014, 2014. a

Perrone, M. R. and Burlizzi, P.: Mediterranean aerosol typing by integrating three-wavelength lidar and sun photometer measurements, Environ. Sci. Pollut. Res., 23, 14123–14146, https://doi.org/10.1007/s11356-016-6575-7, 2016. a, b, c

Picton, P.: Neural Networks for Pattern Recognition, 2nd edn. Palgrave, 2000. a

Roberts, J. B., Clayson, C. A., Robertson, F. R., and Jackson, D. L.: Predicting near-surface atmospheric variables from Special Sensor Microwave/Imager using neural networks with a first-guess approach, J. Geophys. Res., 115, D19113, https://doi.org/10.1029/2009JD013099, 2010. a

Sakai, T., Shibata, T., Kwon, S. A., Kim, Y. S., Tamura, K., and Iwasaka, Y.: Free tropospheric backscatter, depolarization ratio, and relative humidity measured with the Raman lidar at Nagoya in 1994–1997: Contributions of aerosols from the Asian Continent and the Pacific Ocean, Atmos. Environ., 34, 431–442, https://doi.org/10.1016/S1352-2310(99)00328-3, 2002. a

Samaras, S., Nicolae, D., Böckmann, C., Vasilescu, J., Binietoglou, I., Labzovskii, L., Toanca, F., and Papayannis, A.: Using Ramanlidar-based regularized microphysical retrievals and Aerosol Mass Spectrometer measurements for the characterization of biomass burning aerosols, J. Comput. Phys., 299, 156–174, https://doi.org/10.1016/j.jcp.2015.06.045, 2015. a

Satheesh, S. K. and Krishna, K.: Radiative effects of natural aerosols: A review, Atmos. Environ., 39, 2089–2110, https://doi.org/10.1016/j.atmosenv.2004.12.029, 2005. a

Schnaiter, M., Horvath, H., Möhler, O., Naumann, K. H., Saathoff, H., and Schöck, O. W.: UV-VIS-NIR spectral optical properties of soot and soot-containing aerosols, J. Aerosol Sci., 34, 1421–1444, https://doi.org/10.1016/S0021-8502(03)00361-6, 2003. a

Seinfeld, J. H., Bretherton, C., Carslaw, K. S., Coe, H., DeMott, P. J., Dunlea, E. J., Feingold, G., Ghan, S., Guenther, A. B., Kahn, R., Kraucunas, I., Kreidenweis, S. M., Molina, M. J., Nenes, A., Penner, J. E., Prather, K. A., Ramanathan, V., Ramaswamy, V., Rasch, P. J., Ravishankara, A. R., Rosenfeld, D., Stephens, G., and Wood, R.: Improving our fundamental understanding of the role of aerosol-cloud interactions in the climate system, P. Natl. Acad. Sci. USA, 113, 5781–5790, https://doi.org/10.1073/pnas.1514043113, 2016. a, b

Sicard, M., Guerrero-Rascado, J. L., Navas-Guzmán, F., Preißler, J., Molero, F., Tomás, S., Bravo-Aranda, J. A., Comerón, A., Rocadenbosch, F., Wagner, F., Pujadas, M., and Alados-Arboledas, L.: Monitoring of the Eyjafjallajökull volcanic aerosol plume over the Iberian Peninsula by means of four EARLINET lidar stations, Atmos. Chem. Phys., 12, 3115–3130, https://doi.org/10.5194/acp-12-3115-2012, 2012. a, b, c, d

Stachlewska, I. S., Zawadzka, O., and Engelmann, R.: Effect of Heat Wave Conditions on Aerosol Optical Properties Derived from Satellite and Ground-Based Remote Sensing over Poland, Remote Sens., 9, 1199, https://doi.org/10.3390/rs9111199, 2017. a

Stachlewska, I. S., Samson, M., Zawadzka, O., Harenda, K. M., Janicka, L., Poczta, P., Szczepanik, D., Heese, B., Wang, D., Borek, K., Tetoni, E., Proestakis, E., Siomos, N., Nemuc, A., Chojnicki, B. H., Markowicz, K. M., Pietruczuk, A., Szkop, A., Althausen, D., Stebel, K., Schuettemeyer, D., and Zehner, C.: Modification of local urban aerosol properties by long-range transport of biomass burning aerosol, Remote Sens., 10, 412, https://doi.org/10.3390/rs10030412, 2018. a, b, c

Stein, A. F., Draxler, R. R., Rolph, G. D., Stunder, B. J. B., Cohen, M. D., and Ngan, F.: NOAA's HYSPLIT atmospheric transport and dispersion modeling system, B. Am. Meteorol. Soc., 96, 2059–2077, https://doi.org/10.1175/BAMS-D-14-00110.1, 2016. a, b

Stohl, A., Forster, C., Eckhardt, S., Spichtinger, N., Huntrieser, H., Heland, J., Schlager, H., Wilhelm, S., Arnold, F., and Cooper, O.: A backward modelling study of intercontinental pollution transport using aircraft measurements, J. Geophys. Res., 108, 4370, https://doi.org/10.1029/2002JD002862, 2003. a

Taylor, M., Kazadzis, S., Tsekeri, A., Gkikas, A., and Amiridis, V.: Satellite retrieval of aerosol microphysical and optical parameters using neural networks: a new methodology applied to the Sahara desert dust peak, Atmos. Meas. Tech., 7, 3151–3175, https://doi.org/10.5194/amt-7-3151-2014, 2014. a

Tesche, M., Ansmann, A., Müller, D., Althausen, D., Engelmann, R., Freudenthaler, V., and Groß, S.: Vertically resolved separation of dust and smoke over Cape Verde by using multiwavelength Raman and polarization lidars during Saharan Mineral Dust Experiment 2008, J. Geophys. Res., 114, D13202, https://doi.org/10.1029/2009JD011862, 2009a. a

Tesche, M., Ansmann, A., Müller, D., Althausen, D., Mattis, I., Heese, B., Freudenthaler, V., Wiegner, M., Esselborn, M., Pisani, G., and Knippertz, P.: Vertical profiling of Saharan dust with Raman lidars and airborne HSRL in southern Morocco during SAMUM, Tellus B, 61, 144–164, https://doi.org/10.1111/j.1600-0889.2008.00390.x, 2009b. a, b, c, d, e, f

Tesche, M., Müller, D., Gross, S., Ansmann, A., Althausen, D., Freudenthaler, V., Weinzierl, B., Veira, A., and Petzold, A.: Optical and microphysical properties of smoke over Cape Verde inferred from multiwavelength lidar measurements, Tellus B, 63, 677–694, https://doi.org/10.3402/tellusb.v63i4.16362, 2011. a, b, c, d

Tesche, M., Glantz, P., Johansson, C., Norman, M., Hiebsch, A., Ansmann, A., Althausen, D., Engelmann, R., and Seifert, P.: Volcanic ash over Scandinavia originating from the Grímsvötn eruptions in May 2011, J. Geophys. Res., 117, D09201, https://doi.org/10.1029/2011JD017090, 2012. a

Trippetta, S., Sabia, S., and Caggiano, R.: Fine aerosol particles (PM1): Natural and anthropogenic contributions and health risk assessment, Air Qual. Atmos. Hlth., 9, 621–629, https://doi.org/10.1007/s11869-015-0373-0, 2016. a

Vaughan, G., Draude, A. P., Ricketts, H. M. A., Schultz, D. M., Adam, M., Sugier, J., and Wareing, D. P.: Transport of Canadian forest fire smoke over the UK as observed by lidar, Atmos. Chem. Phys., 18, 11375–11388, https://doi.org/10.5194/acp-18-11375-2018, 2018. a

Veselovskii, I., Dubovik, O., Kolgotin, A., Lapyonok, T., Di Girolamo, P., Summa, D., Whiteman, D. N., Mishchenko, M., and Tanré, D.: Application of randomly oriented spheroids for retrieval of dust particle parameters from multiwavelength lidar measurements, J. Geophys. Res., 115, D21203, https://doi.org/10.1029/2010JD014139, 2010. a, b, c

Vlăduţescu, D. V., Wu, Y., Gross, B., Charles, L., Moshary, F., and Ahmed, S.: Aerosol hygroscopic models based on in situ measurements and lidar retrievals, in: 5th WSEAS Int. Conf. on Environment, Ecosystems and Development, Tenerife, Spain, 14–16 December 2007. a, b

Wandinger, U., Müller, D., Böckmann, C., Althausen, D., Matthias, V., Bösenberg, J., Weiß, V., Fiebig, M., Wendisch, M., Stohl, A., and Ansmann, A.: Optical and microphysical characterization of biomass-burning and industrial pollution aerosols from multiwavelength lidar and aircraft measurements, J. Geophys. Res., 107, 8125, https://doi.org/10.1029/2000JD000202, 2002. a

Wandinger, U., Baars, H., Engelmann, R., Hünerbein, A., Horn, S., Kanitz, T., Donovan, D., van Zadelhoff, G., Daou, D., Fischer, J., von Bismarck, J., Filipitsch, F., Docter, N., Eisinger, M., Lajas, D., and Wehr, T.: HETEAC: The Aerosol Classification Model for EarthCARE, EPJ Web of Conferences, 119, 01004, https://doi.org/10.1051/epjconf/201611901004, 2016. a, b

Wang, W., Gong, W., Mao, F., Pan, Z., and Liu, B.: Measurement and Study of Lidar Ratio by Using a Raman Lidar in Central China, Int. J. Environ. Res. Public Health, 13, 508, https://doi.org/10.3390/ijerph13050508, 2016. a

Waterman, P. C.: Symmetry, unitarity, and geometry in electromagnetic scattering, Phys. Rev. D, 3, 825–839, https://doi.org/10.1103/PhysRevD.3.825, 1971. a

Zieger, P., Fierz-Schmidhauser, R., Weingartner, E., and Baltensperger, U.: Effects of relative humidity on aerosol light scattering: results from different European sites, Atmos. Chem. Phys., 13, 10609–10631, https://doi.org/10.5194/acp-13-10609-2013, 2013. a