Retrieval of aerosol optical depth from surface solar radiation measurements using machine learning algorithms, nonlinear regression and a radiative transfer based look-up table

In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge of past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure of aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements made with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table method based on radiative transfer modelling, a nonlinear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations made with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and nonlinear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations, with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in reproducing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content, which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it.
This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during the observation period.

Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
The Fifth Assessment Report of the Intergovernmental Panel on Climate Change states that the most significant source of uncertainty in the projections of climate is related to aerosols (IPCC, 2013). One significant contribution to this uncertainty comes from the fact that without knowledge of the aerosol burden in the past, we are not able to estimate the current forcing of anthropogenic aerosol. For example, the effect of changes in the current aerosol emissions on climate depends on the background aerosol load during the preindustrial era (e.g. Andreae and Rosenfeld, 2008; Carslaw et al., 2013). In addition, the current estimates of past aerosol emissions are highly uncertain (Granier et al., 2011); thus increased knowledge of historical aerosol levels would improve our ability to estimate the present-day aerosol radiative forcing.
One limiting factor in determining the properties of global aerosol in the past has been that observations of aerosol radiative effects have been limited to fairly recent periods. For example, the aerosol optical depth has mainly been measured using sun photometers, and the most widely known ground-based network of sun photometers is the Aerosol Robotic Network (AERONET; Holben et al., 1998). Although AERONET already contains over 700 stations globally, with a fairly good spatial coverage compared to many other observation networks, it still lacks temporal coverage, having provided aerosol optical properties and AOD only since the 1990s and having reached its current status only in recent years. The earliest records of satellite-based AOD are provided by TOMS (Total Ozone Mapping Spectrometer, e.g. Torres et al., 2002) and AVHRR (Advanced Very High Resolution Radiometer, Geogdzhayev et al., 2005) from 1979 and 1983 onwards respectively. However, neither of these instruments was specifically designed to retrieve aerosol properties. The more recent dedicated aerosol sounders, such as ATSR (The Along Track Scanning Radiometer 2, Llewellyn-Jones and Remedios, 2012), MODIS (Moderate Resolution Imaging Spectroradiometer, Levy et al., 2010), VIIRS (Visible Infrared Imaging Radiometer Suite, Jackson et al., 2013) and MISR (Multi-angle Imaging SpectroRadiometer, Kahn and Gaitley, 2015) offer data from 1995, 2000 and 2002 onwards respectively. It is therefore apparent that neither sun photometer nor satellite records of AOD are available for all the decades during which industrialization has had a significant effect on the aerosol load.
There have been, however, recent studies where aerosol load has been indirectly retrieved from global surface solar radiation (SSR) or separately from direct and diffuse radiation measurements, which cover much longer time periods than sun photometer and satellite observations of AOD. Recently, Kudo et al. (2011) and Lindfors et al. (2013) used radiation measurements taken with pyranometers and pyrheliometers to estimate AOD. Lindfors et al. (2013) demonstrated that AOD can be estimated by using SSR and water vapour information and a look-up table (LUT) generated with a radiative transfer code. Their method produces AOD estimates that have 2/3 of the results within ±20 % or ±0.05 of collocated AERONET AODs. Because pyranometer SSR measurements have been made around the globe since the 1950s, the use of AOD estimates based on SSR measurements would enable us to construct AOD time series that go several decades back in time.
Since the 1990s machine learning methods have made their way into atmospheric sciences and have been used e.g. in satellite data processing, climate modelling and weather prediction (Hsieh, 2009). Because of their ability to retrieve parameters from data that have strongly non-linear relationships, they have the potential to retrieve AOD from a combination of solar radiation measurements and auxiliary data such as water vapour content (WVC) and solar zenith angle (SZA), similarly to what was done by Lindfors et al. (2013) using a radiative transfer-based approach. The aim of the present work is to investigate how well machine learning methods are able to estimate AOD from pyranometer observations by evaluating their performance in comparison with a radiative transfer-based look-up table approach. We chose four different methods: neural network (NN, McCulloch and Pitts, 1943), random forest (RF, Breiman, 2001), Gaussian process (GP, Santner et al., 2013) and support vector machine (SVM, Smola and Schölkopf, 2004) and compared them against a look-up table and a non-linear regression method (NR, Bates and Watts, 1988). The performance of these methods was evaluated with AERONET AOD observations in Thessaloniki, Greece, after the AOD estimates were derived from SSR observations. Non-linear regression has been successfully used in multiple studies within aerosol and atmospheric sciences (e.g. Huttunen et al., 2014; Ahmad et al., 2013). Of these machine learning methods, neural networks (NNs) have been actively used in different types of applications in atmospheric sciences. For example, they have been applied to retrieve aerosol properties from remote sensing instruments (Olcese et al., 2015; Taylor et al., 2014). Moreover, Foyo-Moreno et al. (2014) used NNs to indicate that the ratio between solar diffuse radiation and normal direct irradiance is the most adequate parameter for estimating AOD from solar radiation measurements. The study by Olcese et al. (2015) is similar to ours in the sense that they use alternative data together with a neural network approach in an attempt to retrieve AOD at an AERONET site. In their study, they fill in missing AOD values (e.g. due to cloud cover) at one AERONET station based on trajectories and AOD observed at another site. To our knowledge, the rest of the analysed methods have not been used to retrieve aerosol properties directly from observations.

Data and methods
We compared the ability of several methods to estimate AOD, based on SSR and water vapour measurements (and SZA, which can be readily determined for any given time and location), against AERONET AOD measurements at 500 nm (henceforth AOD) taken at Thessaloniki, Greece. This site was chosen for this study because it has all the necessary high-quality measurements from a 10-year time period and because it is the same site to which Lindfors et al. (2013) applied their LUT approach. Furthermore, the location has varying aerosol concentrations and relatively high AOD values throughout the year. [...] Lindfors et al., 2013). The calibration of the pyranometer has been confirmed to stay within the quoted manufacturer accuracy (Bais et al., 2013).

AERONET measurements
AERONET is a network of sun and sky scanning radiometers that measure direct sun and sky radiance at several wavelengths, typically centred at 340, 380, 440, 500, 670, 870, 940 and 1020 nm, providing measurements of various aerosol-related properties (Holben et al., 1998). From direct sun measurements we exploited AOD and WVC data. When sky radiance measurements are also included, more detailed aerosol properties such as single scattering albedo (SSA) and asymmetry parameter (g) can be retrieved (Dubovik et al., 2000). In the evaluation of the machine learning methods we used Level 2.0 (cloud-screened and quality assured) AERONET direct sun measurements of AOD and WVC for Thessaloniki. The Cimel sun photometer is located on the roof of the Physics Department in the close vicinity of the pyranometer discussed above. To interpret some of our results in more detail, we also used Level 1.5 (cloud-screened) retrievals from the inversion products. However, when we selected the data from the Level 1.5 inversion product, we applied all the other Level 2.0 AERONET criteria except for the AOD threshold. In other words, we applied the same rigorous quality control that is required for Level 2.0 data, but we relaxed the threshold for AOD at 440 nm from 0.4 to 0.1, in order to retain more measurements of reliable quality for our data analysis.

Cloud-screening of the pyranometer measurements and collocation with the AERONET measurements
Cloud screening is a crucial step in the analysis, because only the contribution of aerosols, not clouds, is to be considered. The SSR data were first cloud screened in order to ensure that only clear-sky measurements were included in the analysis (see Lindfors et al., 2013, for more details). However, during the analysis of the data it became evident that even after the initial cloud screening, the SSR data still included observations that deviated significantly from the main body of the observations. Since there is a high probability that these outliers in the data were caused e.g. by cloud contamination, we applied additional screening to the data. Thus, we removed the clear outliers of possibly undetected clouds, in our case those observations that deviated by more than ±20 W m⁻² from an exponential regression fit (with regression constants a, b and c). This additional screening was applied through regression of SSR against AOD for a given range of SZA (within ±0.5°). It has to be noted that these data were only a small fraction of all the data that remained after the cloud screening and it is very unlikely that the additional cloud screening would affect the main results and the conclusions of our study. For each AOD measurement, the SSR values within ±1 min were collocated, averaged and finally normalized to the Sun-Earth distance corresponding to 1 January. The training data set for the machine learning methods contained the years 2009-2014 and the validation (verification) data set the years 2005-2008. These periods were selected because we wanted to verify whether the methods could provide reasonable AOD estimates for a period other than the training period. The training data set covered approximately 2/3 and the validation data set 1/3 of the whole data. For all methods the input parameters are SSR, WVC and SZA and the output is an AOD estimate. Table A1 in Appendix A summarizes the maximum, minimum, average, standard deviation (SD) and median of the input and output parameters. Table A1 shows that AOD is generally larger in the validation data set, although the maximum value is larger in the training data set.
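The additional screening and the year-based split described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the record layout, the function names and the callable `fit` (a stand-in for the fitted regression of SSR against AOD within one SZA bin) are assumptions made for the example. Only the ±20 W m⁻² threshold and the two periods are taken from the text.

```python
def screen_outliers(records, fit, threshold=20.0):
    """Keep records whose SSR is within +/- threshold (W m-2) of fit(aod).

    `fit` is a hypothetical callable returning the regression estimate of SSR
    for a given AOD inside one narrow SZA bin.
    """
    return [r for r in records if abs(r["ssr"] - fit(r["aod"])) <= threshold]


def split_by_year(records):
    """Training set: 2009-2014, validation set: 2005-2008 (periods from the text)."""
    train = [r for r in records if 2009 <= r["year"] <= 2014]
    valid = [r for r in records if 2005 <= r["year"] <= 2008]
    return train, valid


if __name__ == "__main__":
    # Made-up records and a made-up linear stand-in for the regression fit.
    data = [
        {"year": 2006, "aod": 0.20, "ssr": 600.0},
        {"year": 2010, "aod": 0.20, "ssr": 560.0},  # 40 W m-2 below the fit
        {"year": 2012, "aod": 0.40, "ssr": 570.0},
    ]
    kept = screen_outliers(data, lambda aod: 630.0 - 150.0 * aod)
    train, valid = split_by_year(kept)
    print(len(kept), len(train), len(valid))
```

Splitting by calendar year rather than at random keeps the validation period disjoint from the training period, which is the property the text asks for.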

Radiative transfer model based look-up table (LUT)
To retrieve AOD from SSR observations Lindfors et al. (2013) produced a LUT based on radiative transfer simulations. They simulated SSR in different atmospheric conditions by varying AOD, WVC and SZA systematically. They used a single aerosol model for all the simulations, and therefore called their AOD estimate an effective AOD, which is only a function of SSR, SZA and WVC. Other parameters were assumed constant, e.g. an Ångström exponent of 1.1 and an SSA of 0.92 at 500 nm (the spectral pattern of SSA follows the rural background aerosol model by Shettle (1989), where SSA changes from roughly 0.92 at 400 nm to 0.89 at 1000 nm). The asymmetry parameter was assumed wavelength independent with a value of 0.68, while the albedo [...]
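The idea of inverting a look-up table can be illustrated with a small sketch. Everything below is an assumption made for illustration: `toy_ssr` is a made-up, monotonic stand-in for a radiative transfer model (it is not the model of Lindfors et al., 2013), and the AOD grid spacing is arbitrary. The sketch only shows the inversion step: for a given WVC and SZA, find the AOD at which the tabulated SSR matches the observation.

```python
import math


def toy_ssr(aod, wvc_cm, sza_deg):
    """Hypothetical forward model: SSR (W m-2) decreasing with AOD and WVC.

    This is NOT a radiative transfer code, only a smooth monotonic placeholder.
    """
    mu = math.cos(math.radians(sza_deg))
    return 1361.0 * mu * 0.85 * math.exp(-0.12 * aod / mu) * (1.0 - 0.03 * math.log1p(wvc_cm))


AOD_GRID = [i * 0.01 for i in range(201)]  # 0.00 ... 2.00 (arbitrary grid)


def lut_invert(ssr_obs, wvc_cm, sza_deg, forward=toy_ssr):
    """Return the AOD whose tabulated SSR matches ssr_obs (linear interpolation)."""
    ssr_grid = [forward(a, wvc_cm, sza_deg) for a in AOD_GRID]
    # The table decreases with AOD; clamp observations outside its range.
    if ssr_obs >= ssr_grid[0]:
        return AOD_GRID[0]
    if ssr_obs <= ssr_grid[-1]:
        return AOD_GRID[-1]
    for i in range(len(AOD_GRID) - 1):
        hi, lo = ssr_grid[i], ssr_grid[i + 1]
        if lo <= ssr_obs <= hi:
            t = (hi - ssr_obs) / (hi - lo)
            return AOD_GRID[i] + t * (AOD_GRID[i + 1] - AOD_GRID[i])
    return AOD_GRID[-1]  # not reached for a monotonic table


if __name__ == "__main__":
    observed = toy_ssr(0.305, 2.0, 50.0)
    print(lut_invert(observed, 2.0, 50.0))
```

In a real application the table would be precomputed over AOD, WVC and SZA with a radiative transfer code; computing the row on the fly here simply keeps the example short.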

Non-linear regression method (NR)
Non-linear regression (NR) is a multivariate analysis method which is used when the dependencies between the study variables are not linear (Bates and Watts, 1988). NR is useful especially when there are physical reasons for believing that the relationship between the response and the predictors follows a particular functional form. Benefits of NR are that it needs only moderate-sized samples of the studied phenomena to give adequately precise results and that as an output it gives a simple but not predefined function for prediction. An additional advantage of NR over the other methods presented in this paper is that once the parameters are estimated, they can be used in similar cases without additional training data. In this study we assume that AOD can be estimated as a function of SSR, WVC and SZA. Multiple formulations of the NR function were tested; the coefficients b0-b6 of the function with the best prediction ability for these data were determined using R software (R Core Team, 2014) and are shown in Table A2.
[...] In this case the model inputs are SSR, WVC and SZA, and the output is AOD. The training is executed with a training algorithm; in this paper the Levenberg-Marquardt algorithm is used (Hagan and Menhaj, 1994). A total of 20 NNs were trained in this case. The NNs differed from each other in the number of neurons in the hidden layer. The five networks with the smallest prediction error within the training data set were selected for the final committee of networks. The final prediction of the NN model was computed as the median of the outputs of all networks in the committee. For more information on NNs see, for example, Bishop (1995).
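The committee step described for the neural networks (select the five networks with the smallest training error, then take the median of their outputs) can be sketched independently of how the networks are trained. In this minimal sketch the trained models are represented by plain callables and the function names are assumptions. No network training is shown.

```python
from statistics import median


def select_committee(models, training_errors, size=5):
    """Return the `size` models with the smallest training error."""
    ranked = sorted(zip(training_errors, range(len(models))))
    return [models[i] for _, i in ranked[:size]]


def committee_predict(committee, x):
    """Median of the member predictions for one input x = (ssr, wvc, sza)."""
    return median(member(x) for member in committee)


if __name__ == "__main__":
    # Twenty made-up "networks" that differ only by a constant bias.
    biases = [0.01 * k for k in range(-10, 10)]
    models = [lambda x, b=b: 0.2 + b for b in biases]
    errors = [abs(b) for b in biases]
    committee = select_committee(models, errors)
    print(committee_predict(committee, (450.0, 2.5, 60.0)))
```

Using the median rather than the mean makes the committee output less sensitive to a single member that behaves badly for a given input.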

Random forest (RF)
Random forest is a machine learning technique that may be used for classification and non-linear regression (Breiman, 2001). RF for non-linear regression consists of an ensemble of binary regression trees. Each of these trees is constructed using a randomized training scheme and is essentially a piecewise constant fit to the training data set. The prediction of an RF model is obtained by averaging the regression tree predictions over the whole model ensemble. In this study, the RF implementation from the Scikit-Learn machine learning library (Pedregosa et al., 2011) was used. We used (SSR, WVC, SZA, SSR×WVC, SSR×SZA, WVC×SZA) as the RF model inputs and AOD as the output. A randomized cross-validation scheme was used to find the optimal training parameters for the RF. For more information on RFs see, for example, Friedman et al. (2001).
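The two ideas in this description, a piecewise constant tree and averaging over a randomized ensemble, can be shown with a deliberately simplified sketch. It is not the Scikit-Learn implementation used in the study: each "tree" here is a single-split stump on one input, randomization comes only from bootstrap resampling, and all names are assumptions made for the example.

```python
import random


def fit_stump(xs, ys):
    """Depth-1 regression tree: one threshold, constant prediction on each side."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all inputs identical: fall back to a constant
        m = sum(ys) / len(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x < t else mr


def fit_bagged_stumps(xs, ys, n_trees=50, seed=0):
    """Average of stumps, each fitted to a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


if __name__ == "__main__":
    xs = [i / 10 for i in range(20)]
    ys = [0.1 if x < 1.0 else 0.5 for x in xs]  # made-up step-like target
    model = fit_bagged_stumps(xs, ys)
    print(model(0.2), model(1.8))
```

A full random forest grows deeper trees on all six inputs and also randomizes the features considered at each split; the averaging step is the same.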

Support vector machine (SVM)
Support vector machine (SVM) is a machine learning technique (Vapnik, 1995; Burges, 1998). In this study, we use the standard SVM regression (SVR) formulation based on the commonly used ε-SVR with a radial basis kernel function. The libsvm package was used to implement the SVM (Chang and Lin, 2011). The objective of ε-SVR is to find a function that has at most ε deviation from the training data set outputs. The training of an ε-SVR model is formulated as a quadratic (convex) optimization problem in which Vapnik's ε-insensitive loss function is minimized (e.g. Vapnik, 1995). The ε-SVR model has two training parameters that were used to control the training: the regularization parameter, which controls the smoothness of the approximation function (sensitivity to noise), and the parameter ε, which determines the number of support vectors by governing the accuracy of the approximation function. The SVM control parameters were determined by means of a grid search. For a more detailed description of the method, the reader is referred to Smola and Schölkopf (2004).
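Two pieces of this description lend themselves to a short sketch: the ε-insensitive loss and a grid search over the two control parameters. The loss below is the standard definition (zero inside the ε-tube, linear outside). The `score` callable in the grid search is a hypothetical placeholder for whatever validation error a trained model would give, and the function names are assumptions. No libsvm calls are shown.

```python
from itertools import product


def eps_insensitive_loss(y_true, y_pred, eps):
    """Vapnik's loss: zero if the error is within eps, linear beyond it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_loss(y_true, y_pred, eps):
    """Average eps-insensitive loss over paired observations and predictions."""
    losses = [eps_insensitive_loss(t, p, eps) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


def grid_search(score, c_values, eps_values):
    """Return the (C, eps) pair with the lowest score(C, eps)."""
    return min(product(c_values, eps_values), key=lambda pair: score(*pair))


if __name__ == "__main__":
    print(eps_insensitive_loss(0.30, 0.32, 0.05))  # inside the tube
    print(eps_insensitive_loss(0.30, 0.40, 0.05))  # outside the tube
    # Made-up score surface standing in for a validation error.
    toy_score = lambda c, eps: (c - 10) ** 2 + (eps - 0.05) ** 2
    print(grid_search(toy_score, [1, 10, 100], [0.01, 0.05, 0.1]))
```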

Gaussian process (GP)
Gaussian process (GP) for machine learning is a generic supervised learning method that may be used, for example, for non-linear regression. In GP learning, the function inputs and outputs are treated as Gaussian random variables and the correlations between these variables are modelled. The predictions given by a GP model are computed as conditional probability distributions given the training data and function inputs. As the prediction given by a GP model is a probability distribution, the error estimates for the predicted point estimates are obtained automatically. In this study, the GP implementation from the Scikit-Learn machine learning library was used. The same input and output variables as with the RF models were used in the GP training. The best performing correlation function training parameters were sought using maximum likelihood estimation. A total of 25 GP models were trained. The training of each model was carried out using 2500 training data samples that were randomly sampled from the full training data set. The five best performing GP models were selected for the final GP model committee. The final prediction was computed as the median of the predictions given by the GP models in the committee. For more information on GPs for machine learning see, for example, Welch et al. (1992), Rasmussen and Williams (2006), and Santner et al. (2013).
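The statement that error estimates come with the prediction can be made concrete with the standard textbook form of GP regression (see e.g. Rasmussen and Williams, 2006). The expressions below assume a zero-mean prior and independent Gaussian observation noise with variance \sigma_n^2; these are assumptions of this illustration and not necessarily the exact configuration used in the study. Here K is the matrix of correlation function values between the training inputs, \mathbf{k}_* the vector of values between the training inputs and a new input \mathbf{x}_*, and \mathbf{y} the vector of training outputs.

```latex
\begin{align}
  \mu_*      &= \mathbf{k}_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} \mathbf{y}, \\
  \sigma_*^2 &= k(\mathbf{x}_*, \mathbf{x}_*)
               - \mathbf{k}_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} \mathbf{k}_* .
\end{align}
```

The predictive mean \mu_* serves as the point estimate and \sigma_*^2 as its variance. The matrix inverse also indicates why training on a random subsample of 2500 points is attractive: the cost of the inversion grows rapidly with the number of training samples.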
Results
Because RF and NR are not able to produce as good estimates as the LUT method, they were left out of the more detailed analysis.

Comparison of the methods
Although these methods are able to predict the average AOD with good accuracy, they differ when we compare their ability to predict different AOD levels. In Fig. 1, the colourbar indicates the absolute number of results in AOD bins of 0.01 × 0.01 (vertically and horizontally); in addition, 1 : 1 lines and linear fits are included. Based on the linear fits, NN appears to have the best agreement with AERONET data for the whole AOD range. As the average and median values of AERONET AOD are 0.240 and 0.207 respectively (Table 1), the main population of the measurements is in the range of moderate AODs. The machine learning methods are obviously weighted to perform best in this range of AODs. However, from Fig. 2, which shows the absolute difference between AERONET and predicted AOD, we can see that LUT and GP tend to significantly underestimate AOD for AODs larger than 0.5, while NN and SVM reach smaller differences with AERONET on average, although with larger overall variability than LUT and GP. Although NN and SVM also start to deviate from the observations at higher AODs, these deviations are more modest in a relative sense, as can be seen from Fig. 3, which shows the relative difference between the observations and predictions. All the methods overestimate AOD in relative terms when AOD approaches zero (Fig. 3). However, as Fig. 2 demonstrates, the absolute error is systematically very low in the small AOD region (AOD < 0.2). NN and SVM generalize better than the other methods for large AODs, where the amount of data is small. In Table 1, the four last rows represent the values for cases where the results of machine learning methods are combined by averaging them. As can be seen from the table, these combinations do not improve the estimates compared to the statistical values of individual methods.
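Comparisons of this kind rest on a few simple statistics of paired observed and predicted AOD. The sketch below shows one way to compute them. The function names are assumptions, the mean absolute deviation is taken here as the mean of |predicted − observed|, and the agreement criterion is the "±20 % or ±0.05" rule quoted in the Introduction for Lindfors et al. (2013). The exact definitions behind Table 1 may differ.

```python
import math


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def mean_abs_deviation(obs, pred):
    """Mean of |predicted - observed| (assumed definition of MAD)."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def fraction_within(obs, pred, rel=0.20, abs_tol=0.05):
    """Share of predictions within +/-20 % or +/-0.05 of the observation."""
    hits = sum(1 for o, p in zip(obs, pred) if abs(p - o) <= max(rel * o, abs_tol))
    return hits / len(obs)


if __name__ == "__main__":
    observed = [0.10, 0.20, 0.40]   # made-up AERONET AOD values
    predicted = [0.12, 0.20, 0.60]  # made-up retrievals
    print(pearson_r(observed, predicted))
    print(mean_abs_deviation(observed, predicted))
    print(fraction_within(observed, predicted))
```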

The effect of water vapour on AOD predictions
Huttunen et al. (2014) showed that WVC and AOD typically have a positive correlation. Therefore, we investigated how the AOD estimates from different methods are affected by WVC. Figure 4 shows the relative difference between the predictions and measured AOD with respect to WVC. From this figure, we can see that the LUT-based AODs are overestimated at the smallest and underestimated at the largest WVC values. The reason for this behaviour is that the LUT method has been set to assume prescribed and constant properties for many relevant parameters that affect SSR (other than AOD and WVC), e.g. aerosol single scattering albedo, asymmetry parameter and surface albedo (Lindfors et al., 2013). Consequently, the assumption of constant SSA in particular leads to a WVC-dependent systematic bias of the LUT-based AOD, as we will show next. The other methods are closer to the ratio of 1 without such a systematic bias, excluding the SVM underestimation for the smallest WVC.
Figure 5 shows measured SSR and LUT-based SSR for a narrow range of SZAs (48.50-51.50°). AOD is on the horizontal axis, SSR on the vertical axis and WVC is shown with the colourbar. From Fig. 5a it is evident that LUT incorporates a strong WVC-dependent structure: for a given SSR level, AOD decreases with increasing water vapour content. This pattern follows from the assumption that the aerosol composition remains the same, i.e. it has a fixed SSA value. Thus in the LUT method, increases in SSR absorption by water vapour are compensated by decreases in aerosol extinction.
Figure 4. The same as Fig. 3, but the ratio of predicted to measured AOD is given as a function of the water vapour content (WVC).
In the real atmosphere, water vapour content also has implications for aerosol composition and size. If all conditions apart from water vapour remained constant, an increase of water vapour would also increase the uptake of water into aerosol particles, thus affecting the aerosol SSA. The effect of fixed SSA is also visible in the way the LUT-based AOD estimates are distributed (Fig. 5a). In Fig. 5c we can see that for a given AOD in the LUT, the highest WVC values always correspond to the lowest SSR values. However, the same pattern is not clearly visible either in the plot with the measured values (Fig. 5b) or in the plot with AOD from NN (Fig. 5d). This indicates that although the machine learning methods do not explicitly get any information about the possible systematic covariability of WVC and SSA, they seem to be able to detect it indirectly, at least to some extent. To further illustrate this, Fig. 6a shows the AERONET measurements of AOD and single scattering co-albedo, 1-SSA at 500 nm, as a function of WVC. Here, because it is discussed together with the absorption by water vapour, we considered it more illustrative to show the single scattering co-albedo rather than SSA. In this plot, SZA, SSR and season were limited respectively to 58° < SZA < 62°, 420 W m⁻² < SSR < 460 W m⁻² and June-August, which still leaves enough data. Thus, the plot illustrates the covariability of WVC and SSA for a limited range of surface solar radiation and SZA, for conditions when the LUT method produces lower AOD values for higher WVC (Fig. 5a). However, Fig. 6a clearly shows that the opposite relationship between AOD and WVC is found in the measurements. Moreover, this pattern is compensated by aerosol absorption (remember that in this subset we constrained SSR), which decreases with increasing WVC; this is likely related to aerosol swelling by hygroscopic growth, which increases the scattering of the aerosol. Therefore, we can conclude from the measurements that because of the covariability of WVC and SSA in Thessaloniki, the assumption of a fixed SSA in the LUT causes limitations for predicting AOD, while the machine learning methods can take this relationship into account indirectly, at least to some extent. Using radiative transfer modelling we demonstrated the magnitude of these changes in water vapour and aerosol absorption, as indicated in Fig. 6. Indeed, they induced opposite effects of similar magnitude in surface solar irradiance.
For the base case, we simulated SSR with WVC of 2.8 cm and 1-SSA of 0.06 (with SZA of 60° and AOD of 0.3) as inputs, resulting in 439.9 W m⁻². When we increased the water vapour column to 3.6 cm, the corresponding decrease in SSR was about 6.8 W m⁻². However, when we additionally decreased the aerosol absorption (1-SSA) to 0.04, the difference to the base case shrank to 1.8 W m⁻² (i.e. the reduced absorption offset about 5.0 W m⁻² of the decrease), and this remaining amount can mostly be explained by the asymmetry parameter, which also exhibits a systematic dependence on WVC (stronger forward scattering by particles grown in humid conditions). The lower panel of Fig. 6 further illustrates the role of fixed SSA in the observed WVC-dependent bias in the LUT results, which can be avoided with the machine learning methods. It shows the mean ratio of LUT-estimated and AERONET-measured AOD on the right-hand side y axis as a function of water vapour content (essentially the same results as shown by the box plot in Fig. 4). Additionally, on the left-hand side y axis, the single scattering albedo (estimated for 500 nm) from AERONET measurements is shown as a function of water vapour amount as well. This also demonstrates that the over- and underestimations of the LUT method coincide with SSA ranges that are below and above the assumed fixed value of 0.92 (shown with the red dashed line) respectively. Visibly, the ratio on the right-hand axis of Fig. 6b does not reach one until SSA is roughly 0.93 instead of 0.92. Presumably, SSA actually has a different wavelength pattern than the one assumed in the LUT.

Conclusions
We have used several inverse methods to retrieve aerosol optical depth (AOD) from surface solar radiation (SSR) and water vapour content (WVC) measurements (with corresponding solar zenith angle data) taken in Thessaloniki, Greece. Two traditional (look-up table and non-linear regression) and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) were used to retrieve AOD estimates for the years 2005-2008.
Then we compared the AOD estimates with collocated AOD measurements by Aerosol Robotic Network (AERONET).Our comparisons showed the following.
AOD estimates based on the LUT method agreed better with AERONET than the NR estimates, but apart from RF, the machine learning methods produced AOD estimates that were comparable to or better than those of the LUT.
NN and SVM methods showed good correspondence to AERONET observations for both low and high AODs, while the rest of the methods tended to overestimate low AODs and underestimate high AODs. The main reason for the better performance of these machine learning methods was that there were no constraints on the aerosol single scattering albedo (SSA) in the retrieval. In other words, the methods do not need to explicitly make assumptions about the optical aerosol properties of the atmosphere because they seem to be able to indirectly account for the covariation of WVC and SSA.
When compared with AERONET measurements, the best AOD estimates were retrieved with the machine learning algorithms, but only NN and SVM were also able to generalize accurate estimates for large AODs.
The machine learning methods are sensitive to the selection of the training data set and other constraints, and are generally valid only for the range of variables used for their training; thus care needs to be taken when these methods are employed.
These tools have the potential to be used in the retrieval of AOD from SSR measurements to lengthen the time series of AOD. Historical AOD is essential in the estimation of anthropogenic aerosol effects and in the evaluation of AOD retrievals from space-borne instruments before the 1990s.
The intention of comparing different methods was to test their ability in an "out-of-the-box" configuration. With this in mind, the methods were not particularly tuned to reach the best possible results. It is very likely that, e.g. by optimizing the free parameters used in each of the non-linear modelling approaches, their ability to reproduce observed AOD could be further improved.

Machine learning methods for AOD retrievals
Neural network (NN)
Artificial neural networks belong to the family of machine learning methods (McCulloch and Pitts, 1943). As usual in machine learning methods, the aim of an artificial NN is to generate a mathematical model to represent the phenomenon that is examined. The mathematical model of the NN structure specifically consists of interconnected neurons with numeric weights. A typical NN model is the multilayer perceptron (MLP) (Rosenblatt, 1958), which is used in this study. An MLP network consists of several neuron layers: an input layer, hidden layers and an output layer. The weights and other parameters of the model are tuned or trained with a specific training data set containing input-output pairs of the phenomenon.

Figure 2. Differences between predicted and observed (AERONET) AOD for the methods: (a) LUT (look-up table), (b) GP (Gaussian process), (c) NN (neural network) and (d) SVM (support vector machine) with respect to the observed AOD. The crosses indicate the means of each subgroup, the limits of the boxes are 25, 50 and 75 % of the data, and the lines are plotted with 1.5 times the interquartile ranges.

Figure 3. The same as Fig. 2, but the vertical axis indicates the ratio of the predicted to the observed (AERONET) AOD.

Figure 6. (a) Aerosol optical depth (AOD), water vapour content (WVC) and 1-SSA at 500 nm from the AERONET inversion sky data. (b) SSA at 500 nm, WVC and the LUT's predicted AOD divided by the observed AOD (AERONET), with the red line fixed to SSA (500 nm) = 0.92 (as in the LUT).

Table 1. Statistical characteristics of observed (AERONET) and predicted AOD by the methods of NR (non-linear regression), LUT (look-up table), NN (neural network), RF (random forest), GP (Gaussian process), SVM (support vector machine) and some of their combinations (averages without weights, e.g. the NN, SVM combination is their average result). Correlation coefficient (R²), mean absolute deviation (MAD), median and their ±20 % percentiles between the observed and predicted AOD. Time consumption of the methods for training/estimation on a recent average computer, given as an order of magnitude (seconds, minutes or hours). The number of observations is 10 684.
...ength and SZA. For a more detailed description of the LUT method see Lindfors et al. (2013).

Table 1
[...] each AOD interval of 0.005. Based on the different statistics in Table 1, machine learning methods (NN, SVM, GP) produce a good match with AERONET data and they perform equally well or better than the LUT method according to all the metrics. Due to the fact that RF and NR are not able to [...] reaches one