Satellite data assimilation to improve forecasts of volcanic ash concentrations

Data assimilation is a powerful tool that requires available observations to improve model forecast accuracy. Infrared satellite measurements of volcanic ash mass loadings are often used as input observations into the assimilation scheme. However, these satellite-retrieved data are often two-dimensional (2D), and cannot be easily combined with a three-dimensional (3D) volcanic ash model to continuously improve the volcanic ash state in a data assimilation system. By integrating available data including ash mass loadings, cloud top heights and thickness information, we propose a satellite observational operator 5 (SOO) that translates satellite-retrieved 2D volcanic ash mass loadings to 3D concentrations at the top layer of the ash cloud. Ensemble-based data assimilation is used to continuously assimilate the extracted measurements of ash concentrations. The results show that satellite data assimilation can force the volcanic ash state to match the satellite observations, and that it improves the forecast of the ash state. Comparison with highly accurate aircraft in-situ measurements shows that the effective duration of the improved volcanic ash forecasts is about a half day. It is shown that an effective half-day ash forecast significantly improves 10 the quality of the advice given to aviation over continental Europe.


Introduction
It has been known for many years that volcanic ash is dangerous to commercial jet aircraft (Casadevall, 1994). Little is known about the exact level of ash concentrations that becomes dangerous to the jet turbine, and the current regulation states that the highest concentration an aircraft can endure is 4.0 mg m −3 (EASA, 2011;Fu et al., 2015). Until carefully designed engine 15 performance tests are conducted in realistic volcanic ash cloud conditions, a cautious approach to advising commercial jet operations in airspace is recommended. As a consequence, the eruption of the Eyjafjallajökull volcano in Iceland from 14 April to 23 May 2010, caused an unprecedented closure of the European and North Atlantic airspace resulting in a huge global economic loss of up to 5 billion US dollars (Bonadonna et al., 2012). Due to the major impacts on the aviation community, a lot of research has been initiated on how to efficiently reduce these aviation impacts, starting with improving the accuracy of 20 volcanic ash forecasts after eruption onset (Eliasson et al., 2011;Schumann et al., 2011).
For forecasting volcanic ash plumes, many Volcanic Ash Transport and Dispersion Models (VATDM) are worldwide available, e.g., PUFF (Searcy et al., 1998), HYSPLIT (Draxler and Hess, 1998), ATHAM (Oberhuber et al., 1998), NAME (Jones et al., 2007) and LOTOS-EUROS (Fu et al., 2015). Literatures have reported in-depth comparisons between volcanic ash real-time advisories and volcanic ash transport models (Witham et al., 2007;Webley et al., 2012). The meteorological wind fields and estimates of eruption source parameters (ESPs) (Mastin et al., 2009) such as plume height (PH), mass eruption rate (MER), particle size distribution (PSD) and vertical mass distribution (VMD) are necessary as inputs to the VATDM model. A VATDM uses physical parameterizations of particle sources and removal processes that affect the concentrations in a dispersing volcanic plume. Ash particle aggregation process is important for the estimates of the ash concentrations in atmospheres (Durant et al., 5 2010), but this process is not included in most dispersion models due to its computational demand and physical complexity (Folch et al., 2010). Without accurate knowledge of the ash removal rate in atmospheres and the temporal variation of MER at the volcano, it is impossible to provide quantitatively accurate concentration forecasts for the ash plume arriving in an airspace over a long distance (Prata and Prata, 2012;Fu et al., 2016).
For the purpose of improving the forecast accuracy of volcanic ash concentrations, efficient technologies must be employed 10 to compensate the VATDM's inaccuracies. Data assimilation, which refers to the (quasi-) continuous use of the direct measurements to create accurate initial conditions for model runs (Zehner, 2010), is one of the most commonly used approaches for real-time forecasting problems (Evensen, 2003). In each assimilation step, a forecast from the previous model simulation is used as a first guess, then available observations are used to modify this forecast in better agreement with these observations. An important aspect of the assimilation approach is that it reduces the dependency on accurate knowledge of the ESPs-which are 15 generally unknown at the time of an eruption. This is an effective approach where valid real-time volcanic ash measurements are required to guarantee the forecast accuracy (Fu et al., 2015). Fortunately, during volcanic ash transport, different types of scientific measurement campaigns can be performed to collect information including occurrence of the ash plume and the nature. The measurements contained e.g., LIDAR measurements (Flentje et al., 2010), satellite observations (Stohl et al., 2011;Prata and Prata, 2012), aircraft in-situ measurements (Weber et al., 2012;Schumann et al., 2011) and ground-based in-situ 20 measurements (Emeis et al., 2011). However, it should be noted that such measurements are not always available globally.
Satellite measurements are of special interest, because the detection domain is large and the output data is long-time continuous. For example, the Spin Enhanced Visible and Infrared Imager (SEVIRI, on board the Meteosat Second Generation (MSG) platform provides a large view coverage of the atmoshere and earth's surface from 70 • S to 70 • N and 70 • W to 70 • E (Schmetz et al., 2002). There are 3712 × 3712 pixels covering the full-disk. Images can be acquired for the whole disk every 25 15 minutes. These satellite data have been used for many years to retrieve ash mass loadings in a dispersing volcanic plume (Zehner, 2010). Nowadays, ash mass loadings (Prata and Prata, 2012), the effective particle size (Kylling et al., 2015) as well as the ash cloud top height (Francis et al., 2012), are available in near real-time as satellite products during volcanic plume transport. The availability of satellite-based data provides us with an opportunity to employ data assimilation with a VATDM to continuously correct the volcanic ash state, and then improve the forecast accuracy of volcanic ash concentrations.

30
There still exist difficulties on how to efficiently use volcanic ash mass loadings, because a VATDM is in most cases a 3D model, while the satellite-retrieved ash mass loadings are 2D data. One 2D mass loading can be considered as an integral of ash concentrations along a retrieval path (the path can be a line or a curve which depends on a specified retrieval algorithm) (Prata and Prata, 2012). Thus, the 2D measurements are not directly suited in a 3D data assimilation system. Since satellites provides 2D ash mass loadings and the model has 3D concentrations, an observational operator is needed by the data assimilation 35 2 Atmos. Chem. Phys. Discuss., doi:10.5194/acp-2016-436, 2016 Manuscript under review for journal Atmos. Chem. Phys. Published: 14 July 2016 c Author(s) 2016. CC-BY 3.0 License. algorithm, and must be derived to make both types of information directly comparable. For this purpose, vertical information of the ash cloud, such as the ash cloud top height, its thickness and the corresponding uncertainties, should be included. Some satellites can provide very detailed vertical information on plumes, e.g., Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) lidar measurements, but are spatially sparse (Winker et al., 2012). To get the thickness information for the entire volcanic ash plume, CALIOP data only is not enough. Based on estimates using extensive airborne lidar observations, the 5 cloud thickness has been investigated and estimated to be 0.5-3 km over space and over time for the entire volcanic ash plume (Marenco et al., 2011;Christopher et al., 2012;Prata et al., 2015).
In order to integrate data and information about volcanic ash clouds, the first goal in this study is to develop a satellite observational operator to translate satellite-retrieved 2D ash mass loadings to 3D concentrations at the top layer of the ash cloud. Secondly, using the extracted in-situ measurements, we investigate whether ensemble-based data assimilation can sig-10 nificantly improve the volcanic ash state. Finally, the effective duration of the improved volcanic ash forecasts after satellite data assimilation is quantified and the impacts on aviation advices are also discussed.

Available data for data assimilation
In this study, geostationary SEVIRI observations for the 2010 Eyjafjallaökull volcanic eruption plume (Prata and Prata, 2012) are used as the study case to design a suitable satellite observational operator (SOO) for data assimilation. SEVIRI is a 12-15 channel spin-stabilized imaging radiometer. Measurements are made from the visible (0.5 µ m) to the infrared (13.4 µ m) with a spatial resolution from 3 km × 3 km at the sub-satellite point to 10 km × 10 km at the edges of the scan. A region covering 30 • W to 15 • E and 45 • N to 70 • is selected here for analysis which includes the geographic area affected by the Eyjafjallaökull volcanic ash (see Fig. 1).
The main retrieval products from SEVIRI are ash mass loadings and the effective particle size for each pixel (Prata and 20 Prata, 2012;Kylling et al., 2015) (see Fig. 1a, Fig. 1b) where 03:15 UTC 16 May 2010 is chosen for the illustration, without loss of generality). The mass loading at each 2D pixel gives information of the ash cloud from the top view, which can be taken as an integration of ash concentrations along the retrieval path (Prata and Prata, 2012). The effective particle size gives information of which ash particles are measured for the mass loadings (Kylling et al., 2015). Together with ash mass loadings and effective particle sizes, other products including the ash cloud top height (Fig. 1c), and the error of ash mass loadings 25 ( Fig. 1d) are also available in a near real-time sense (Francis et al., 2012;Prata and Prata, 2012). As a parameter used in SEVIRI retrievals, the data of ash cloud top height is adopted with the SEVIRI-KNMI product of ash height, which has been evaluated with a reasonable accuracy, as reported by de Laat and van der A (2012). The error of ash mass loadings indicates the uncertainty and accuracy of the retrieved mass loadings (Prata and Prata, 2012). All the data shown in Fig. 1  Limited validation has shown that the satellite ash retrievals are sufficiently accurate for use with dispersion models to correct ash concentration forecasts (Prata and Prata, 2012;Kylling et al., 2015). However, the correction cannot be directly and automatically implemented by data assimilation due to the insufficient vertical resolution in satallite data (Bocquet et al., 2015).
Besides the data from satellites, Marenco et al. (2011) reported airborne lidar observations for the case study of 2010 5 Eyjafjallaökull volcanic eruption. The lidar gives profiles of aerosol extinction, which is suitable for aerosol and thin cloud observations. The lidar equiped in their six research flights is in a nadir-viewing geometry, taking measurements with a vertical resolution of 1.5 m and an integration time of 2 s (Marenco et al., 2011). Based on extensive lidar measurements along all the flights, Marenco et al. (2011) investigated that the cloud thickness can be estimated at 0.5-3 km over space and over time for the entire volcanic ash plume. Although this thickness information is not deterministic, its uncertainty spread is suitable in an 10 observational operator for satellite data assimilation.

Derivation
The derivation of the satellite observational operator (SOO) is shown in Fig. 2. The observed values by SEVIRI for the ash mass loadings (ML) ranged between 0.2 and 5.0 g m −2 (see Fig. 1a where value at 0 means no data). The values can be taken 15 as an integration of ash concentrations along the retrieval path. In principle, the satellite retrieval path could be complicated but generally it is assumed to be a straight line (along the line-of-sight, ignoring refraction) from the measuring apparatus (see (1) To extract ash concentrations from SEVIRI retrievals, ML vert only is not sufficient and knowledge about the vertical distribution of ash cloud must be included. It has been investigated that an ash plume can be considered as vertically thin, almost like a box-car distribution function (Prata and Prata, 2012). The cloud vertical profile can be described with the height of the 25 top and the thickness of the cloud. As introduced in Section 2 and Section 1, the cloud top height (H top ) is available from satellite remote sensing and the thickness of the plume is investigated (T low to T high , i.e., 0.5 to 3 km). Fig. 2b illustrates how the 3D ash concentrations are extracted from the obtained mass loadings in the vertical direction (ML vert ). Only the blue layer in Fig. 2b is with 100% certainty of containing ash, because it is determined by the lowest possible thickness (T low ). Thus the extraction layer used in this study only refers to the blue layer. follows. First we define where T is a step length and N s is the number of the possible thickness. T low represents the blue layer (see Fig. 2b) with the fixed thickness of 0.5 km and T high − T low represents the yellow layer with the fixed thickness of 2.5 km. T is chosen at a 5 small value compared to T low , which guarantees N s is not too small (e.g., less than 2) to sample enough thickness T 1 , T 2 , · · · , T Ns with equal probability. (e.g., T is chosen as 0.05 km in this case study, thus N s is calculated as 50.) Corresponding to the sampled thickness, the ash concentration can be calculated as also a sample from C 1 to C Ns , as shown in Eq.
(3). Therefore, the mean (C mean ) and the standard deviation (C std ) of the sampled ash concentrations can be calculated by Eq. (4) and (5), C mean is therefore used in this study as the extracted concentration C between the heights [H top -T low ] and H top (i.e., the blue layer in Fig. 2b). How much of the mass is distributed to the blue layer (ML blue ) can be calculated by Eq. (6), 15 Note that, below the height [H top -T low ] (the yellow layer shown in Fig. 2b), ash concentrations should not be extracted, because the concentrations there can be zero. For example, when H top equals to 8.0 km and the cloud's thickness is 1.0 km, thus ash concentrations between 7.5 km and 8.0 km can be obtained from Eq. (4). However, the ash concentrations at 5.0 km cannot be extracted because actually there is no concentration at that height. Another note is that Eq. (5) is calculated based on the most commonly used assumption of Gaussian distribution in error analysis. Gaussian often occurs in nature, and by lack of 20 other information this is therefore a suitable first choice. (2) to (5) describe the details of the SOO. The operator transforms the 2D ash mass loadings (ML) to 3D ash concentrations (C, here C=C mean ). Fig. 3a is the extracted ash concentrations (C) at the cloud top layer. It can be seen that the extracted ash concentrations in the entire ash plume are between 0.1 and 1.5 mg m −3 .

25
Now we quantify the extraction error C error (i.e., error in the extracted concentrations), which is important for a data assimilation system. The extraction error is not equivalent to C std , but depends on both the retrieval error ML error (error in mass loadings, as shown in Fig. 1d) and C std . The dependence is described by Eq. (7) in terms of uncertainty, where the uncertainty (U C ) of the extracted concentrations is calculated based on the derivation uncertainty ( C std C , normalized standard deviation) and the retrival uncertainty ( MLerror ML ). Eq. (7) is defined according to the fact that the extraction is performed on the uncertain ash mass loadings, indicating the conditional probability relation. Now U C is quantified, the error (C error ) in the extracted concentrations can be easily obtained by Eq. (8), Fig. 3b is the illustration of the extraction error C error .
C error together with C describes the 3D measurements (mean, error) for ensemble-based data assimilation. The concentrations and the related errors extracted from SOO correspond to the retrieved ash effective particle sizes. Note that during satellite retrievals, only the fine particles can be detected in the tropospheric volcanic plume based on the robust and reliable retrieval algorithms (Prata, 1989;Corradini et al., 2008). For the case study, the retrieved effective particle sizes are around 2.5 -10.0 Note that the observational operator (H, see Appendix A) used in EnSR is different from SOO. SOO is an operator designed as a preprocessing procedure before data assimilation, which doesn't depend on the model space and aims to transfer 2D satellite data into 3D measurements for later usage in EnSR. While, H is an intrinsic operator in the EnSR algorithm as specified in Appendix A.

25
To simulate volcanic ash transport, the LOTOS-EUROS model (Schaap et al., 2008) is used in this study. The configurations and evaluations of the LOTOS-EUROS as a proper volcanic ash transport model has been reported by Fu et al. (2015). The for a successful data assimilation. Here we use uncertainties in plume height (PH). PH is set to be the radar detection data from Icelandic Meteorological Office (IMO) and its uncertainty is estimated to be 20 % (Bonadonna and Costa, 2013). The stochastic plume height (PH) is assumed to be temporally correlated with exponential decay and the correlation parameter τ is set to be 1 hour (Fu et al., 2015). Thus, the PH noise (N ph ) at two times (t 1 and t 2 ) has the relation (Evensen, 2004) of , where E represents the mathematical expectation.

Total measurement error
To assimilate measurements in a simulation model, the total measurement error must be first estimated, which not only contains the extraction error (Section 3.2), but also includes an estimate of the model representation error (Fu et al., 2015). model is probably small. For the moment we will therefore not explicitly specify a model representation error, but implicitly assume that it is zero. Therefore, the total measurement error used in data assimilation, is equal to the extraction error in this study.
After the measurements of concentrations are extracted and the total measurement error is quantified, EnSR can be used to combine them with the LOTOS-EUROS model running to reconstruct optimal estimates. 20

Assimilation performance
In the following, we first examine how data assimilation actually works in the system (see Fig. 4). The first assimilation result with EnSR (Fig. 4a, b), at 01:00 UTC 16 May 2010, is shown against the SEVIRI extracted measurements (Fig. 4c). Ensemblebased data assimilation includes two steps (forecast and analysis, see Appendix A). After one-day of model running started from 00:00 UTC 15 May 2010, the EnSR forecasted state at 01:00 UTC 16 May 2010 is shown in Fig. 4a. Comparing the 25 state to the extracted measurements, the former (with concentrations higher than 2.0 mg m −3 in the main plume) shows a large overestimation compared to the latter (with concentrations lower than 0.8 mg m −3 ). After the EnSR analysis step, the overestimation has almost vanished. The concentrations in large parts are now closer to the extracted measurements (see Fig.   4b). In reality, the overestimation is usually elusive and hard to avoid, which is mainly due to lack of sedimentation processes (Fu et al., 2016). The comparison between the state of analysis and forecast illustrates that the EnSR assimilation process 30 can solve the problem of overestimation. Note that, in this study only PM 10 ash component is considered in the assimilation system, which is consistent with the SEVIRI retrieved ash effective particle size (see Fig. 1b). It is also the main mass fraction that is transported at large distances from the source, since most of the large particles (and therefore mass) is removed quickly from the plume.
The results above were compared in terms of concentrations, not the original mass loadings. To guarantee the assimilation performance, the comparison in concentrations only is not sufficient, because the original data is not concentrations but mass loadings. If SOO is not accurate enough for extracting the concentrations at specified heights, the assimilation results still can 5 approximate well the inaccurate extracted concentrations due to the intrinsic forcing of ensemble-based algorithms. Obviously, the approximation in this case is incorrect. Based on this consideration, original measurements (i.e., SEVIRI ash mass loadings, see Fig. 5a) need to be employed for a further validation. After two-days continuously assimilating SEVIRI measurements of the extracted PM 10 concentrations, the analyzed volcanic ash state at 00:00 UTC 18 May 2010 is shown in in Fig. 5c. The conventional simulation without assimilation is also presented (Fig. 5b), which is currently the commonly used strategy for the 10 simulation of volcanic ash transport (Webley et al., 2012;Fu et al., 2015). It is clear that the mass loadings with EnSR are in a good agreement with the SEVIRI mass loadings, in almost the entire plume. For example, around the Netherlands, the mass loadings from EnSR is accumulated to 3.0 -3.5 g m −2 , which is in good match with SEVIRI retrieved 3.3 g m −2 . While with the conventional simulation, the mass loadings in this area exceed 5.0 g m −2 . It can be seen that EnSR effectively decreases the overestimation level compared to the conventional simulation. Because the measurements used in the assimilation system 15 are extracted with the SOO, thus the good results with respect to mass loadings also verify the suitability of SOO for extracting reliable 3D concentrations. Note that here we also checked the SEVIRI mass loading retrieval error and the standard deviation of the mass loadings, and found that both have the same order of magnitude.
5 Forecasts after satellite data assimilation 5.1 Quantification of the effective duration using aircraft in-situ measurements 20 According to discussions above, the accuracy of volcanic ash state is significantly improved by ensemble-based data assimilation after a continuous assimilation period (e.g., two days). Apparently, with the improved state as initialization, an improved forecast can be obtained (Fu et al., 2015). However, it remains unknown how long the improvement on forecasts will last. In other words, if the improvement on volcanic ash forecasts is only within a short time (e.g., 1 hour) and after this short time period, the forecast deteriorates dramatically, there will be no significance for providing valid aviation advice or guidance 25 based on the assimilation forecast. This is because usually a flight mission lasts at least 2 hours for an intercontinental flight, thus aviation advice needs a safety guarantee for a longer time, for example at least 6 hours.
To investigate the effective duration of the improved ash forecasts after assimilation, a one-day forecast is performed by initializing EnSR analyzed state (Fig. 5b)  For the period from 09:30 to 11:00 UTC (Fig. 6c), although the forecasting time has been over 9 hours (i.e., the last assim-10 ilation is 9 hours ago), the forecasted concentrations still have a good match with the accurate aircraft measurements, while the conventional forecast (i.e., forecast without assimilation) doesn't. This result shows the forecast over 11 hours after assimilation has also a high accuracy compared to the measurements. The result can be extended to 15 hours comparing with the other period from 12:30 to 15:00 UTC (Fig. 6d). Therefore, the validation test with aircraft in-situ measurements shows that forecasts after satellite data assimilation remains valid and accurate for at least 15 hours. This time duration last probably even 15 longer, but we don't have aircraft measurements later than 15 hours available to evaluate this. Considering that this duration is likely to be dependent on the weather dynamics, so in this study we quantify the effective time duration at a shorter length for a conservative estimate, e.g., 12 hours (a half day).

Improved aviation advice
The quantification in the previous section shows that forecasts after satellite data assimilation within a half day can be consid- To understand the details of the impact on aviation advice from satellite data assimilation, without loss of generality, ash 25 forecasts at three time snapshots (01:00, 06:00 and 12:00 on May 18, 2010, left panel in Fig. 7) at a height of 4 km are chosen.
As comparisons, forecasts without assimilation are also shown in Fig. 7 (right panel). In Fig. 7b, d and f, the forecasted ash concentrations in the volcanic plume has reached over 4.0 mg m −3 in large parts of Europe before 12:00 UTC 18 May, 2010.
Areas with such a high ash concentration would have been classified as No Fly Zone (NFZ) (EASA, 2011;Fu et al., 2015).
Thus, only relying on simulation results, the aviation advice on continental Europe would lead to an advice of no flights are 30 allowed. Consequently, this aviation advice would shut down flights in a large area and would cause a huge economic loss.
In contrast, based on the forecasts after satellite data assimilation, the aviation advice before 12:00 UTC 18 May, 2010 would have been completely changed to fly safe in almost the entire Europe, because except in small parts of Northern Italy, ash concentration all over Europe are lower than 3.0 mg m −3 . This illustrates that aviation advice can significantly benefit from the 9 Atmos. Chem. Phys. Discuss., doi:10.5194/acp-2016-436, 2016 Manuscript under review for journal Atmos. Chem. Phys. Published: 14 July 2016 c Author(s) 2016. CC-BY 3.0 License.
improved forecast accuracy after satellite data assimilation. Note that a better model can be used to also lower overestimation (e.g., 4.0 mg m −3 ) as the assimilation does, so that in this case the aviation advice will be improved based on the better model and the assimilation. Another note is that the uncertainties of aviation advice (with assimilation) are different for different regions. The advice near the volcano must be less certain than those in continental Europe, because in the regions near the volcano (mainly Iceland) high ash concentrations can appear due to later sudden emissions. We would also suggest the aviation 5 advices in the Netherlands, Belgium and west Germany are more certain than the other regions of continental Europe where the aviation advice can be influenced by observation limits (as shown in Fig. 5a).

Conclusions
In this study, a satellite observational operator (SOO) was developed to translate 2D satellite ash mass loadings to 3D ash concentrations at the top layer of volcanic ash clouds. To extract ash concentrations, not only the near real-time SEVIRI data of 10 ash mass loadings, effective particle size, ash cloud top height are employed, but also measurements of the ash cloud thickness (0.5-3 km) for the entire plume are combined. The advantage of SOO is that it can use rough thickness information to get uncertain concentrations, which are suitable for the data assimilation methodology.
The extracted ash concentration measurements enable us to perform ensemble-based data assimilation in a 3D volcanic ash transport model. By employing a pre-processing procedure before data assimilation to generate new measurement values 15 by averaging all surrounding measurements, the model representation error is estimated at about zero. The extraction error is also calculated, and the total measurement error (defined as the sum of the extraction error and the model representation error) is therefore quantified, which together with the concentrations describes the 3D measurements (mean, error) for a data assimilation system. The results showed the assimilation significantly reduces the overestimation level of the conventional simulation. The accuracy of the volcanic ash state was shown to be significantly improved by the assimilation of satellite mass 20 loadings. The good assimilation performance also verifies the suitability of the proposed SOO.
With the improved volcanic ash state as initialization, improved volcanic ash forecasts are obtained. Quantification using highly accurate aircraft in-situ measurements showed that the forecasts after satellite data assimilation remain valid and accurate up to a half day. This effective time period probably lasts even longer and it should be validated when more aircraft measurements are available. Our study has demonstrated that aviation can significantly benefit from the improved accuracy of 25 the forecast.
In this paper, we applied an off-line approach for model running and simply used the deterministic meteorological input data. Actually these data also contain uncertainties which have an influence on ash cloud transport. In future work, for more accurate ash forecasting, uncertainties in the meteorological data like wind speed should also be taken into account.

Appendix A: Ensemble Square Root Filter
Ensemble Square Root Filter (EnSR) is essentially a Monte Carlo sequential method (Evensen, 2003), based on the representation of the probability density of the state estimate by an ensemble of N states, ξ 1 , ξ 2 , · · · , ξ N . Each ensemble member is assumed as one sample of a true state distribution. The required ensemble size depends on the model's nonlinearity and the the involved uncertainties. For the application of the filter algorithm to a volcanic ash transport model, an ensemble size of 50 is 5 considered acceptable for maintaining a balance between accuracy and computational cost (Fu et al., 2015(Fu et al., , 2016. In the first step of this algorithm an ensemble of N volcanic ash state ξ a (0) is generated to represent the uncertainty in the initial condition x(0). In the second step (the forecast step), the model propagates the ensemble members from time t k−1 to t k : The state-space operator M k−1 describes the time evolution from the time t k−1 to t k of the state vector which contains the ash 10 concentrations in all the model grid boxes. The filter state at time t k is a stochastic distribution with mean x f and covariance P f given by: 15 The observational network at time t k is defined by the observation operator H that maps state vector x to observation space y by y(k) = H k (x(k))+v(k), where the observation error v is drawn from Gaussian distribution with zero mean and covariance matrix R. Here, y contains the measurements of ash concentrations and R is assumed to be a diagonal matrix with the square of the standard deviation (measurement uncertainty) as diagonal entries. The operator H selects the grid cell in x(k) that corresponds to the observation location. When measurements become available, the ensemble members are updated in the 20 analysis step using the Kalman gain and their ensemble covariance matrix following: where v j represents realizations of the observation error v. To reduce the sampling errors introduced by adding random num-

25
bers v j to the observations, the analysis step can be written in a square root form (Evensen, 2004;Sakov and Oke, 2008a, b).
Using the notations Y = HL f and S = YY , the updated covariance matrix becomes: thus L a can be represented by where T is an N × N matrix which satisfies: TT = I − Y S −1 Y. It can be easily shown that there is a unique symmetric positive definite solution defined as the square root of the symmetric positive definite matrix: 2 . By using the eigenvalue decomposition, the matrix T s has the following form: where T s is referred as the symmetric factor. The symmetric algorithm defined above introduces the smallest analysis increments for an arbitrary compatible norm. The good performance of EnSR has been obtained on improving the forecast accuracies without introducing additional sampling errors (Evensen, 2004;Sakov and Oke, 2008a).  12 Atmos. Chem. Phys. Discuss., doi:10.5194/acp-2016-436, 2016 Manuscript under review for journal Atmos. Chem. Phys. particle size (i.e., ash distribution measured in particle sizes by SEVIRI