Aerosol optical depth assimilation for a size-resolved s c ional model: impacts of observationally constrained, multi-wavelength and fine mode retrievals on regional scale forecasts

Introduction Conclusions References

This discussion paper is/has been under review for the journal Atmospheric Chemistry and Physics (ACP). Please refer to the corresponding final paper in ACP if available.
Aerosol optical depth assimilation for a size-resolved sectional model: impacts of observationally constrained, multi-wavelength and fine mode retrievals on regional scale forecasts P. E. Saide 1 , G. R. Carmichael 1 , Z. Liu 2 , C. S. Schwartz 2 , H. C. Lin 2 , A. M. da Silva 3 , and E. Hyer

Introduction
Atmospheric aerosols play multiple roles including producing acute health impacts, generating visibility issues and creating a substantial climate response (e.g. . Thus, it is important to have accurate estimates of aerosol concentrations in multiple ways: to generate AOD to surface PM 2.5 conversion factors (van Donkelaar et al., 2006;Liu et al., 2005); to improve model daily and monthly estimates of ground level PM 2.5 (Carmichael et al., 2009;Adhikary et al., 2008); to correct model initial conditions to produce improved reanalysis and forecasts (Liu et al., 2011;Benedetti et al., 2009); and to produce better emissions estimates (Huneeus et al., 2012;Heald et al., 2010). Among the satellites and sensors that produce AOD estimates, one of the most commonly used is the operational dark target retrieval from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on board the Terra and Aqua platforms (Remer et al., 2005), as it tends to generate accurate observations over a wide range 15 of surfaces (Petrenko and Ichoku, 2013). However, this retrieval often shows deviations from ground measurements, and centers such as the Naval Research Laboratory (NRL) (Zhang et al., 2008) and National Aeronautics and Space Administration (NASA) (GMAO, 2013) use observationally constrained retrievals (where AOD is empirically fitted to ground sun-photometer data) in their assimilation systems. To our knowledge, 20 the impact of assimilating operational MODIS products versus the observationally constrained ones has not been previously assessed.
MODIS AOD, as other satellite/sensor products, is reported in several wavelengths (three and seven for land and ocean retrievals, respectively) and the wavelength dependency of AOD (Angstrom Exponent) contains aerosol size information (Schuster Introduction  Aerosol RObotic NETwork (AERONET) network to constrain a global aerosol model. AOD retrieval algorithms can also produce a fine mode fraction product. A few studies have explored the use of the fine mode fraction and total AOD simultaneously. For example, Generoso et al. (2007) used fine and coarse mode AOD on global data assimilation experiments using POLDER satellite measurements, as well as Huneeus 5 et al. (2012), that used fine and total MODIS AOD in the context of a global emissions inversion with positive impacts including improved aerosol size distributions. There is a need to further assess the impacts of simultaneous use of these data sets in a data assimilation framework.
In this study we develop the ability of the Gridpoint Statistical Interpolation (GSI) 10 Three dimensional variational (3DVAR) system to simultaneously assimilate various AOD products to correct Weather Research and Forecasting/Chemistry (WRF-Chem) forecasts when using the Model for Simulating Aerosol Interactions and Chemistry (MOSAIC) treatment (Zaveri et al., 2008). This aerosol model is widely used in several applications, but its use in an assimilation framework is challenging due to the large 15 number of species and size bins that need to be treated simultaneously (Li et al., 2012). However, assimilation performed over models that have higher degrees of freedom may be useful when assimilating many data sources at the same time, as both the total mass and aerosol size distribution could be modified to produce a better fit to observations. Section 2 describes the method and additions introduced to GSI to effectively 20 perform assimilation with the MOSAIC model. Then, the system is used in two experiments shown in Sect. 3. First, we assess the impact of assimilating operational MODIS retrievals (dark target land and ocean) versus observationally constrained ones, and second, we evaluate the impact on forecasts when simultaneously assimilating multiple wavelengths and fine and total AOD compared to just assimilating total 550 nm AOD. Introduction

Forecast model
The aerosol forecasts were performed with the Weather Research and Forecasting (WRF) version 3.4.1 regional meteorological model (Skamarock et al., 2008) coupled to aerosol and chemistry (WRF-Chem) (Grell et al., 2005). This is a fully coupled on-5 line model. The chemical and aerosol mechanism used is the CBMZ gas-phase chemical mechanism (Zaveri and Peters, 1999;Fast et al., 2006) coupled to the 8-bin sectional MOSAIC (Zaveri et al., 2008) aerosol model. MOSAIC keeps track of 8 chemical species (sulfate, nitrate, ammonium, organic carbon, black carbon, sodium, chloride and other inorganics, where dust is included) on two phases (dry/interstitial and 10 wet/activated), that along with number concentration (on both phases), water and hysteresis water content per size bin results on a total of 160 species tracked. The model configuration is based on Saide et al. (2012b). Some of the configuration choices include a 12 km horizontal grid spacing with 72 vertical levels with ∼ 60 m level thickness below 3 km, MYNN level 2.5 planetary boundary layer (PBL) scheme (Nakanishi and 15 Niino, 2004), Lin microphysics (Chapman et al., 2009), Goddard short wave radiation (Chou et al., 1998;Fast et al., 2006), and Abdul-Razzak and Ghan (2002) (Guenther et al., 2006), FINN biomass burning emissions (Wiedinmyer et al., 2011) coupled to an online plume-rise model (Grell et al., 2011), Gong et al. (1997 sea salt parameterization and GOCART dust scheme (Zhao et al., 2010). Meteorological and chemical boundary conditions are obtained from National Centers for Environmental Prediction (NCEP) Final Analysis (http://rda.ucar.edu/datasets/ds083.2/) and MOZART 25 forecasts (Emmons et al., 2010), respectively. Even though WRF-Chem has the option to include secondary organic aerosol (SOA) formation coupled to MOSAIC aerosols (Shrivastava et al., 2011;Hodzic and Jimenez, 2011) this analysis, as the focus of this paper is the development and testing of the new assimilation system.

Assimilation system
We use the Gridpoint Statistical Interpolation (GSI) assimilation system (Wu et al., 2002;Kleist et al., 2009). The GSI version used is based on the modifications made 5 by Liu et al. (2011) and Schwartz et al. (2012) to assimilate AOD. However, we incorporated substantial additional modifications suited to the MOSAIC aerosol model as described in the following sub-sections.

3DVAR method
In this study we build upon work of Liu et al. (2011) and Schwartz et al. (2012). As they 10 presented, we use the 3DVAR functional (J), but add terms to allow weak constraints: 15 Where y represents the observation, x the control variable (e.g. the one modified during optimization), with x b the a priori estimate and x uc and x lc the upper and lower constraints, H the observation operator, R, B and K the observation, background and weak constraint error covariance matrixes, and k uc and k lc regularization parameters to weight the weak constraint. Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | of using as control variables the natural logarithm (LN) of 3-D aerosol concentrations. This choice naturally constrains concentrations to be positive and provides multiplicative rather than additive corrections (Henze et al., 2009). In the same manner, we add the option to use the AOD natural logarithm as the observation. When using these logarithmic choices, the sensitivities from the observation operator have to be computed 5 accordingly, which is achieved by using the non-log sensitivities and the chain rule of derivatives: where c represents the aerosol concentrations being analyzed. By using this conversion we are able to use the same code to compute sensitivities for any choice of control 10 variable. To avoid zero values, we set a threshold of 1×10 −20 for AOD and aerosol concentrations when converting to and from the LN variables. An additional modification with respect to the control variable used in previous research is that, instead of using aerosol concentrations output by WRF-Chem (µg kg −1 ), we multiply them by the grid-cell vertical thickness (in meters), which provides a mea- 15 sure of the column concentration and is proportional to aerosol mass rather than aerosol concentration. The consequences of not applying this correction are depicted by the following example. For two given grid-cells in the same column and containing the same aerosol concentrations and uncertainty, the grid with the deeper thickness will contain higher sensitivities, as the same change in concentration will generate a higher 20 increase in AOD due to the deeper layer. By multiplying by the thickness, we avoid the assimilation favoring changes in deeper grid-cells, which could be important in configurations with great vertical variability as the one used in this study.
Finally, instead of using as control variables all aerosol species (Liu et al., 2011;Schwartz et al., 2012) on all size bins, we introduce the option of using total mass per 25 size bin as control variables, and distribute the changes within GSI considering the percentage of mass contribution of each species as a constant for each size bin. This consideration allows a reduction in the number of control variables by a factor equal to 12220 Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | the number of species, which is eight in our case. When using this choice the system is faster, has fewer degrees of freedom and is less likely to accumulate changes on single species. Similar assumptions have been made in other AOD assimilation systems (e.g. Benedetti et al., 2009) with the difference that here we can still produce changes in the total aerosol composition, as different species often dominate different size bins (Saide 5 et al., 2012a). We use the total mass per size bin as control variable for all experiments presented in this study. The weak constraint term added in Eq.
(1) constrains the control variable so the optimal solution would be within user specified bounds or close to them. The implementation is based on the relative humidity weak constrain done in GSI meteorological 10 assimilation. K is diagonal and chosen as a scale (in the variance space) of the control variable. For simplicity we chose it equal to the diagonal of B and use the parameters k uc and k lc for weighting the constraint. x uc and x lc represent the desired bounds for the control variable and are calculated as multiplicative factors applied to the prior (additive in the case of LN control variable). In the experiments, x uc and x lc were chosen 15 equal to 5 · x b and 0.01 · x b with k uc and k lc equal to 0.5 and 0.05. Higher weight and smaller multiplicative bound are given for the upper constraint as we found that overly increasing concentrations (i.e. incorrectly high AOD retrieval) can excessively damage the forecasts. 20 By using standard deviations and vertical and horizontal correlation length scales as inputs, GSI is able to approximate the convolution of a background error covariance matrix (B) by the use of recursive filters (Wu et al., 2002;Purser et al., 2003). Besides vertical and horizontal correlations, chemical and aerosol data assimilation often incorporates the use of cross-species correlation, as many of these are co-emitted or 25 have similar precursors (Elbern et al., 2007). Since we use total mass of all species per size bin as control variable, inter-species correlation is not applicable. There is also a natural correlation for different size bins for each species that needs to be considered 12221 Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | (e.g. Saide et al., 2012a). By using recursive filters we incorporate the capacity to add correlations between aerosol size bins in GSI. Filter passes run along size bins in incremental order and are applied locally for each aerosol size distribution, in a similar way as vertical scales are applied (Wu et al., 2002). As in Liu et al. (2011) and Schwartz et al. (2012), we use the NMC method (Parrish 5 and Derber, 1992) for computing the standard deviations and vertical and horizontal length scales. Depending on the choice of control variable (see Sect. 2.2.1), the same variable has to be the input to the NMC computation. For the case of LN control variables, we constrain the standard deviation to be less than or equal to one LN unit to avoid a very unconstrained system. The NMC method generally uses two forecasts (12 10 and 24 h or 24 and 48 h) to compute statistics. We use a long meteorological spin-up time (Saide et al., 2012b), so following this strategy would consume too much computational resources. Instead, we assess uncertainties by running two continuous parallel simulations driven by different meteorology. In the case of retrospective North American experiments, this can be done using NCEP final analysis and North American Regional 15 Reanalysis (NARR). For simplicity, the inter size bin correlation length are specified in the namelist by the user and not computed through the NMC method. However, we do not discard this possibility for future studies. The size bin correlation length scale was chosen equal to 2 bin units, which prevents excessive accumulation of innovations on a single size bin and distributes the changes along them. The NMC method used as 20 described above for May 2010 yields isotropic horizontal length scales between 15 and 36 km, with smaller and higher values in the lower and upper troposphere, respectively. These are considered small values compared to global data assimilation systems, but are in the range of 1 to 3 times the horizontal grid resolution, which falls between typical ranges (Liu et al., 2011). Vertical length scales vary between 1 and 6 model grid verti-Introduction

Forward and adjoint of the observation operator
While Liu et al. (2011) and Schwartz et al. (2012) used Community Radiative Transfer Model (CRTM) (Han et al., 2006) as forward and adjoint observation operator, here we use WRF-Chem optical properties (OP) routines (Fast et al., 2006). This choice provides consistency between the AOD computed for assimilation and forecast models.

5
The WRF-Chem OP code considers an internal mixture within each aerosol size bin and uses Mie theory along with Chebyshev expansion coefficients for reducing computational time (Fast et al., 2006). This code has shown skill in predicting optical properties against total column data for several regions and aerosol regimes (Yang et al., 2011;Zhang et al., 2010;Chapman et al., 2009;Qian et al., 2010;Zhao et al., 2010;Zhao et al., 2012;Kalenderski et al., 2013) and against in-situ data (Barnard et al., 2010;Shrivastava et al., 2013). The adjoint of this code was obtained using the automatic differentiation tool TAPENADE v 3.6 (Hascoët and Pascual, 2004) sucssesfully passing tangent linear and adjoint tests. We update aerosol water and number within the WRF-Chem OP code added to GSI, 15 so they will be dependent on aerosol concentrations. The water uptake code is extracted from MOSAIC, which uses the activity coefficients of the electrolytes present and the Zdanovskii-Stokes-Robinson method (Zdanovskii, 1948;Stokes and Robinson, 1966). A threshold of 99 % relative humidity was set for water uptake calculations and columns with clouds present are excluded from assimilation. Aerosol number is 20 computed using aerosol concentration and diameter in each bin, assuming that the assimilation does not update diameter using the one in the prior. Another addition to the WRF-Chem OP code added to GSI was the column AOD computation for specific MODIS wavelengths. WRF-Chem computes OP for four wavelengths: 300, 400, 600 and 999 nm. Similarly to WRF-Chem radiative transfer calcu- 25 lations (Fast et al., 2006), interpolation/extrapolation to MODIS wavelengths is done using the Angstrom exponent from the two closest wavelengths. No modifications are needed when computing fine mode AOD, as the coarse bin mass and number are ze-Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | roed out before AOD and sensitivity computations. As the aerosol models in the MODIS algorithm use a modal approach (Remer et al., 2005) while MOSAIC uses a sectional approach, it is hard to create a complete match between the two when computing the fine fraction. For simplicity, we consider fine mode as aerosols with a dry diameter equal or less than 625 nm (first 2 and 4 size bins for the 4 and 8 bins MOSAIC, respectively), which is in agreement with the cut-off diameter of 600 nm used in the standard AERONET retrieval (Dubovik and King, 2000;Dubovik et al., 2000).

Observations and their errors
The observational data sets that were assimilated in the different experiments are described in the following. There was no thinning of the data to maximize data usage.

Operational MODIS level 2 retrieval
Collection 5.1 MODIS aerosol data from Aqua and Terra satellites were obtained from NASA Goddard Space Flight Center. The dark target retrieval, which is the one used, is based on Remer et al. (2005) and Levy et al. (2007). Over land, AOD (the "Cor-rected_Optical_Depth_Land" product) is provided in three wavelengths: 470, 550 and 15 660 nm. However, for AOD at 550 nm lower than 0.2, the angstrom exponent used to compute the other two wavelengths is fixed (Levy et al., 2007) not providing an independent measurement of size distribution. Most AOD values over land were lower than 0.2 for the period of study, thus only the 550 nm retrieval was used in the assimilation. Over ocean AOD (the "Effective_Optical_Depth_Average_Ocean" product) is provided 20 in seven wavelengths (470,550,660,870,1240,1630 and 2130 nm) but only the ones in the range 550-1240 nm are used in the assimilation to keep the wavelengths used close to the range computed by WRF-Chem. The 470 nm wavelength is not used as there is no validation presented for this wavelength over ocean (Remer et al., 2005). The MODIS aerosol dataset also provides fine mode fraction, defined as fraction that ACPD 13,2013 Aerosol optical depth assimilation for a size-resolved sectional model the fine mode (effective radius less than 0.5 µm) (Kaufman et al., 1997) contributes to the total optical thickness, which can be used to compute fine mode AOD. When operational MODIS data are assimilated, the data are quality controlled to avoid degrading the assimilation. These controls include accepting the highest quality flag (qf = 3) over land and any flag (qf = 1, 2 or 3) over ocean and processing only 5 pixels with zero cloud fraction.

NASA Neural Network Retrieval
The NASA Neural Network Retrieval (NNR) is an observationally constrained retrieval designed to generate a better fit with respect to AERONET observations, and is used operationally in the GEOS-5 (Rienecker et al., 2008) aerosol assimilation system 10 (GMAO, 2013). It uses a neural network as an alternative to linear regression to capture possible non-linear relationships. Predictors used for the ocean retrieval include level 2 multi-channel top of the atmosphere (TOA) reflectances, glint, solar and sensor angles, cloud fraction (only when it is lower than 85 %, otherwise pixel is discarded) and GEOS-5 surface wind speeds. Predictors used for the Land retrievals are TOA re- 15 flectances, solar and sensor angles, cloud fraction (< 85 %) and climatological albedo (only if lower than 0.25). An important difference with other post-processing techniques is that it does not use any MODIS AOD retrieval as a predictor. The target used in the neural network (and in the GEOS5 assimilation system) is not directly AERONET AOD, but log(AOD + 0.01), which tends to better represent a Gaussian probability distribu-20 tion. The AOD at 550 nm is available at the same 10 km resolution of the MODIS level 2 operational retrievals (GMAO, 2013).

Naval Research Laboratory (NRL) -University of North Dakota (UND) retrieval
The NRL-UND retrieval is a value-added AOD dataset based on MODIS Level 2 25 aerosol products specifically designed for quantitative applications including data as- similation and model validation. The quality assurance procedures and empirical correction algorithms (to better fit AERONET data) applied to this product are described in Zhang and Reid (2006), Zhang et al. (2008), Shi et al. (2011) and Hyer et al. (2011). This 550 nm AOD retrieval is derived from MODIS collection 5. A product gridded to 0.5 degree is produced by NASA's Land, Atmosphere Near-real-time Capability for EOS 5 (LANCE) with product code MCDAODHD. Due to the high resolution used in this study, the source code of this algorithm was modified to output results on a 0.05 degree grid, with a minimum of one retrieval per grid and without checking for neighbors on the output grid (no "grid buddy checking"). This method always produces a maximum of one retrieval per gridcell (as MODIS minimum grid size is ∼ 10 km) with no aggregation, 10 being comparable in terms of possible pixels generated to the other two retrievals used (MODIS and NASA NNR). In addition, only pixels with cloud fractions equal to zero and with the highest context quality checking were processed.

Observation error
Observational errors were assumed to be the same for all data sets, even though 15 uncertainty is usually provided for the different data sets (e.g. Shi et al., 2011). This assumption was made to provide the same basis for comparing results. AOD errors over land and ocean were assumed to be equal to 0.6 and 0.2 in LN units (∼ 60 % and ∼ 20 % error, respectively). Our approach does not follow the error estimates proposed by Remer et al. (2005) and used by Liu et al. (2011) and Schwartz et al. (2012) 20 (error = a + b · AOD, with a and b constants function of the type of retrieval) as in this treatment relative errors (computed as a percent of the AOD magnitude) increase as AOD is lower. In the case of operational MODIS data assimilation and when computing errors with this approach, spurious high AOD can significantly damage assimilation results as the high AOD will dominate due to the high relative error of the surround- NNR and NRL-UND). We also found that the fixed log-space uncertainty estimates resulted in better analysis results. These improvements suggests that the uncertainty estimates used in previous research may be too high for low AOD values, or may not correctly account for reduction of random error by spatial averaging in the data assimilation system. 5

Study domain and experimental design
The study region is California and its surroundings, which is an area with important air pollution problems, affected by both local and distant sources (Huang et al., 2010). This region has been the target of several recent measurement campaigns such as ARCTAS-CARB (Jacob et al., 2010) and CALNEX/CARES (Zaveri et al., 2012). The coast of California is also important since in this area a persistent stratocumulus deck is found, which means that (1) aerosol retrieval from satellite is more challenging compared with more cloud-free areas, and (2) aerosol-cloud interaction is likely to be important (Hegg et al., 2012;Twohy et al., 2005). The region also represents a challenge in terms of accurate meteorological and air quality predictions (Yver et al., 2013;Fast 15 et al., 2012). The existence of the stratocumulus deck plus the pollution issues makes this area a good place to demonstrate the application of AOD assimilation approaches and asses its limitations. The modeling domain is centered on the central California coast, with a domain spanning from 30 • N to 47 • N and from 133 • W to 112 • W. A large portion of the domain 20 covers the ocean to allow a higher influence of data assimilated here and to better resolve the stratocumulus deck (Saide et al., 2012b). As previously mentioned, 12 km horizontal grid spacing is used.
Results are presented for May 2010. Simulation without data assimilation (from now on referred as "non-assimilated") start on 26 April to allow for model spin-up and run 25 continuously until the end of May. On the assimilation experiments, analysis steps are performed every three hours with a three hour observation window, then forecasts are restarted from meteorology of the previous forecast and run for three hours. The 12227 Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 550 nm operational MODIS AOD retrieval assimilation is considered as the "control" for all experiments, and impacts of other or additional data is assessed. First, we evaluate the impact of assimilating observationally constrained retrievals (i.e. NASA NNR and NRL-UND) and, second, we assess the inclusion of fine mode AOD and multiplewavelengths to the assimilation. We evaluate impacts for PM 2.5 , AOD and Angstrom 5 Exponent (AE). Fractional error and fractional bias (Morris et al., 2005) are computed to asses model performance against non-assimilated observations (see next paragraph). Fractional error reductions (FER) are computed subtracting fractional errors of the experiments and control assimilations. For the second set of experiments, as we assimilate multi-wavelength AOD only over ocean and fine fraction is very infrequent over 10 land for this area and period, we focus our performance analysis on satellite data over ocean and coastal stations.
Observations from different ground monitoring networks were used as independent data to evaluate the data assimilation impacts (Fig. 1). Hourly PM 2.5 data was obtained from US Environmental Protection Agency (EPA) Air Quality System (AQS, 15 http://www.epa.gov/ttn/airs/airsaqs/), which provides a high density of measurements over California with most sites located in urban or sub-urban areas (Pagowski and Grell, 2012). We also used data from the Interagency Monitoring of Protected Visual Environments (IMPROVE, http://vista.cira.colostate.edu/improve/, Malm et al., 1994) network, which collects measurements mainly on remote regions (parks and wilder-20 ness areas), which are representative of one day and collected every 3 days. Besides total PM 2.5 , it also collects aerosol chemical composition measurements, from which we use sulfate, nitrate, chloride, sodium, organic carbon and black carbon. Additionally, total column AOD and Angstrom Exponent (AE) measurements were obtained from AERONET network data (Holben et al., 2001). For the period of study, 10 AERONET 25 stations had data available within the study domain (Fig. 1). Finally, AOD retrievals not yet assimilated are considered as independent data and compared against model forecasts. Even though we perform assimilation every 3 h, most of the data is available in the 18:00 and 21:00 UTC cycles (due to the satellite overpass time and domain of ACPD 13,2013 Aerosol optical depth assimilation for a size-resolved sectional model Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | study), with Terra and Aqua data accumulated mainly in the 18:00 and 21:00 UTC cycles respectively. Thus, by comparing model forecasts and Aqua data one can analyze the performance of assimilation against independent satellite data for a 3 h forecast, or a 21 h forecast by comparing to Terra data. The validation using the three types of observations (aerosol concentration, ground 5 AOD and satellite AOD) represent different levels of independence. The comparison against aerosol concentration observations is the true independent validation as these are not assimilated nor used for obtaining the retrievals assimilated. Comparing against AERONET AOD represents an intermediate level of independence, as even though these observations are not assimilated and have a lower level of uncertainty, they are 10 used to tune the algorithms which compute the assimilated retrievals. Finally, validating against satellite retrievals represents the lowest level of independence, as, even when observations from a different satellite not assimilated are compared to the forecasts, they are computed with the same algorithms as the one assimilated so they retain the same systematic biases, which are propagated into the analysis and forecast. These 15 levels of independence must be considered when analyzing the assimilation tests performance.
Model to observation mapping is described as follows. WRF-Chem output is saved hourly and mapped to ground stations using nearest neighbor interpolation. The hourly PM 2.5 WRF-Chem concentrations are used directly to compare against AQS obser-20 vations. For IMPROVE stations, local time daily averages are computed. AERONET observations are averaged to hourly values which are then compared to hourly WRF-Chem output using the Angstrom Exponent for interpolation to AERONET wavelengths. Finally, both satellite retrievals and model fields are re-gridded to a fixed regular 0.2 × 0.2 degree grid where averages and performance statistics are computed.  Figures 2 and 3 show non-assimilated model performance with respect to PM 2.5 ground observations from the AQS network. In general, the model overestimates PM 2.5 concentrations at most sites, with a global mean of 8.5 µg m −3 and 14.1 µg m −3 for obser-5 vation and model, respectively. As seen in Fig. 2c and d, model biases tend to be more negative over northern California with biases close to zero and smaller errors in the Los Angeles area. Despite the biases, the model is able to reproduce the patterns of highest concentrations in the urban centers (Fig. 2) and captures the synoptic features which generate the high and low particle concentrations in the region (Fig. 3).

Non-assimilated model and retrievals evaluation
10 Figure 4 shows the non-assimilated model evaluation using the IMPROVE speciated observations. The model also overestimates PM 2.5 at these sites. These high model values come from the "Other" aerosol ( Fig. 4), which corresponds mainly (96 %) to the "other inorganics" (oin) specie in MOSAIC. This overestimation can be traced back to dust aerosol in the chemical boundary conditions. The model also shows overesti-15 mation of aerosol nitrate, sea-salt and black carbon. Sea-salt aerosol overestimation is consistent with previous work (Saide et al., 2012b), which is produced by too high sea-salt emissions. The nitrate overestimation may be due to emissions, as the NO x NEI2005 emissions have been found to be overestimated (Kim et al., 2009) and they do not reflect the decreasing trend in NO x emissions up to year 2010 (EPA, 2013). 20 Opposite to the general trend, organic carbon is highly underestimated by the model, which is expected as no SOA scheme was included in the simulations. This difference is more evident as the IMPROVE network consists mostly of remote stations, leaving longer time for SOA production. Sulfate is slightly underestimated, which reflects that SO2 emissions may be low in NEI2005. This could be the result of the NEI emissions  . Significant differences can be seen between the different retrievals. The NASA NNR and NRL-UND retrievals tend to make corrections of the same sign with respect to the operational MODIS, for example they increase AOD over coastal California, while decreasing AOD over the ocean and on the more inland territories, especially over 5 Nevada. Both of these post processing retrievals are calibrated with the AERONET data, so this behavior is expected. However, as the algorithms and inputs are different, there are still some significant differences between both data sets. When looking at specific AERONET sites (Fig. 6) we can see that the post processed techniques are usually closer to the AERONET values than the operational MODIS retrieval. However, 10 there are still persistent biases that the post-processed techniques are not able to overcome, mainly in the area of Nevada and South-East California (Fig. 6f), which is also shown by the high AOD in the monthly means (Fig. 5) in all retrievals. Non-assimilated model monthly mean values (Fig. 5d) show a persistent overestimation in AOD over the ocean and over land for most of the domain. As mentioned 15 above there appears to be a high bias in the boundary conditions associated with dust, which produces a high background AOD over the modeling domain. The nonassimilated model underestimates AOD in Nevada and South-East California, which corresponds to the area mentioned before where the retrievals show higher deviations from AERONET sites. A general overestimation is also found when comparing the 20 non-assimilated model to AERONET stations (Fig. 6), so we anticipate that assimilation should move the aerosol state towards the AERONET observations for most sites and retrievals. An interesting station to analyze is the Caltech Site (Fig. 6e), located in northern Los Angeles. Here, the model shows very small bias for the high AERONET AOD values, which is consistent with small errors and almost no bias found in the PM 2.5 25 AQS comparison (Fig. 2c and d). For this site, satellite retrievals do not exactly match the AERONET data, so we anticipate that assimilation will tend to degrade results in this area as errors in the retrieval are higher than model errors.

MODIS and observationally constrained 550 nm AOD assimilation
Model AOD after assimilation is shown in Fig. 5e and f. Model estimates are closer to the observations being assimilated compared to the non-assimilated one, showing that the optimization process is working properly. Figures 3, 4 and 7 show improvements in PM 2.5 after AOD assimilation. When look-5 ing at the time series of PM 2.5 for the whole month (Fig. 3) it is seen that the bias reduction of the assimilation changes from day to day. These variations can be partially explained by the amount of data being assimilated (Fig. 8) which is a function of several factors including the scan pattern of the MODIS sensor, the quality control applied (the most important being the cloud fraction threshold) and post processing algorithms. For 10 instance, the first 8 days of the month show the consecutive period with the most data available to assimilate and the largest bias reductions. One factor contributing to the correlation between the amount of assimilated data and the resulting bias reduction is the small horizontal length scale used (see Sect. 2.2.2), which prevents corrections during assimilation extending too far from the observation location. Thus, more data 15 will translate into a larger spatial coverage. Results show that all assimilated retrievals reduce the fractional error (from 0.71 on the background to 0.65, 0.62 and 0.64 for MODIS, NASA NNR and NRL-UND assimilation) on a large fraction of stations (85 %, 92 % and 96 % for MODIS, NASA NNR and NRL-UND assimilation). Fractional error reductions (Fig. 7) tend to be higher in 20 stations that originally had higher errors (Fig. 2), like locations in northern and central California. In general, the assimilation of post-processed data (NASA NNR and NRL-UND) has better performance than the operational MODIS data. A very clear example is South Nevada, where MODIS assimilation degrades results, which is in agreement with Schwartz et al. (2012), while both post-processed techniques reduce the errors. 25 As seen in Fig. 8, the NASA NNR retrieval has the highest amount of data assimilated, which is mainly due to the less restrictive quality control applied in this algorithm (e.g. cloud fraction less than 0.85 versus no cloud fraction in MODIS and NRL-UND tests). As discussed previously, having more data tends to improve assimilation performance, thus this is a factor influencing the higher error reductions of assimilating NASA NNR versus the other two retrievals. However, quality of the data is also important, which is why the post-processed techniques show a considerably higher fraction of stations with reduced errors compared to MODIS. In this dimension, the NRL-UND product is 5 the one that shows the highest fraction of stations improved due to the more restrictive quality control applied. The assimilations tend to slightly reduce or even increase errors in the region surrounding Los Angeles (Fig. 7). As mentioned in Sect. 3.1, this was expected as the non-assimilated model has small error and bias in this region, both against PM 2.5 and AERONET measurements. Also as this is a populated area, spatial and temporal concentration gradients are not completely resolved by the 12 km horizontal grid spacing used. This can be observed by comparing observation and model mean maps (Fig. 2a and b), and by the great variability during the day at the AERONET Caltech site not entirely captured in the non-assimilated model (Fig. 6e). The bias reduction against monitored PM 2.5 can also be seen at the IMPROVE sta-15 tions (Fig. 4, bottom), with improvements in the fractional error (from 1.1 to 0.88 globally) for 100 % of the stations analyzed when NASA NNR is assimilated (similar for the other two retrievals). The assimilation does not significantly change aerosol composition as only total AOD is assimilated and because a relatively long correlation length is used between size bins. Thus, the general trend is that aerosol species in the non-20 assimilated model that have a bias of the same sign of total PM 2.5 bias will have their biases reduced in the analysis, while the bias will be increased in the opposite case. For instance, assimilated black carbon and nitrate improved while sulfate and organic carbon degraded after the assimilation tests. This behavior is similar to the one found in experiments assimilating PM 2.5 observations (Pagowski and Grell, 2012).

ACPD
25 Figure 9 shows performance evaluation for a 3 and 21 h forecast against not yet assimilated satellite data. For the 3 h one, assimilation shows error reductions in most of the domain, except in the Nevada and southeast California regions, where the retrievals tend to present issues (see Sect. 3.1). Error reductions tend to be higher over ocean Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version

Interactive Discussion
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | due to the smaller error assigned to these observations during assimilation, which allow the system to better fit the model to observations. Different errors reductions over different ocean areas are related to cloud presence from day to day, with areas with higher cloud fractions showing less error reductions. The 21 h forecast (Fig. 9b) shows smaller but still significant error reductions as model errors over time bring the state 5 closer to the non-assimilated model. Fractional error reductions are close to zero over the ocean from 125 • W to the west as after 21 h the assimilated aerosol have already been advected away from this region, matching the error of the non-assimilated model. Over the ocean south of California there is still the presence of the assimilated aerosol and the fractional error reductions are considerable (above 0.1). Over California, error reductions are generally less than 0.1 but positive, showing that after 21 h there is still persistence of the innovation. This is also seen in the AQS PM 2.5 comparison, where assimilation almost never goes back to the non-assimilated model levels (Fig. 3) and fractional error reduction at 18:00 UTC for all stations and days is equal to 0.06. High error reductions over land after 21 h (Fig. 9b) are found along coastal southern 15 California and northern Mexico, which is consistent with the excellent and long-lasting performance of NASA NNR (and the other retrievals) assimilation against AERONET measurements at the La Jolla site (Fig. 6a). On the other hand, sites like Trinidad head (Fig. 6b) tend to approach the non-assimilated model more rapidly as the domain boundary is close to the site and boundary conditions blow in this direction. As NASA 20 NNR (and NRL-UND as well, not shown) data are closer to AERONET AOD, assimilation of this data sets generally provides a closer agreement to these observations compared to the operational MODIS data assimilation (Fig. 6)

Multiple wavelength and fine mode AOD assimilation
Compared to AQS and IMPROVE data, fine and total and multiple wavelength assimilations do not degrade results compared to the control 550 nm AOD assimilation, obtaining similar statistics to the ones shown in Sect. 3.2, even when results are filtered for coastal stations.
5 Figure 10 shows error reductions for a 3 h forecast. We see that all approaches considerable reduce errors over the ocean for both wavelengths (Fig. 10 top and middle rows). Assimilating only 550 nm AOD (control) reduces the aerosol loads, thus also reducing the 870 nm AOD generating a better fit with these observations without assimilating them. Assimilating fine and total AOD generates smaller error reductions 10 (for both 550 nm and 870 nm AOD) compared to only assimilating total AOD. This is probably because the additional constrain to the fine aerosol reduces the ability of the optimization to generate a closer fit to the total AOD. Another factor that could also create these results and that has been noted to generate issues (Kleidman et al., 2005) is the possible mismatch in the fine and coarse mode definitions, due to differ-15 ent aerosol approaches used in MOSAIC and the MODIS algorithm (sectional versus modal, respectively). On the other hand and opposite to fine AOD assimilation, using multi-wavelength AOD data generates slightly better error reductions for 550 nm AOD while considerably better reductions for 870 nm AOD when compared to the control assimilation. The better fit to 870 nm AOD observations is expected as the 870 nm 20 retrieval is directly being assimilated. Figure 10 (bottom row) also shows error reductions for the Angstrom Exponent (AE). In general, increasing values on AE indicate finer aerosols (Schuster et al., 2006). Over the ocean, the non-assimilated model tends to show very low AE compared to the observed values (not shown) which is consistent with the overestimation of dust 25 coming from the boundaries (see Sect. 3.1). Even though the 550 nm AOD only assimilation generates a good fit to AOD observations, there is only a small change in AE, with regions where there is even an increase in the error. As this assimilation only uses one observation per column and a large correlation length between bin sizes, the assimilation tends to uniformly modify aerosols within bin sizes, not significantly changing the size distribution and thus the AE. A completely different picture is seen for fine and total, and multi-wavelength AOD assimilations, where the use of multiple observations per column modifies the AE and in the right direction, reducing the errors 5 in most of the domain. The fine and total AOD assimilation tends to generate slightly better AE results than the multi-wavelength AOD assimilation as the former directly modifies the fine aerosol. However, we recommend the use of the multi-wavelength over the fine and total AOD data as total AOD burdens are much better estimated by the multi-wavelength approach, as described in the previous paragraph.

10
To better understand the differences between the control assimilation (550 nm AOD only) versus adding additional multi-wavelength data, Fig. 11 shows vertical profiles of PM 2.5 and aerosol number concentration 3 h after a given assimilation. Even though the PM 2.5 column is reduced for both assimilations (Fig. 11a), the use of multiple wavelength data selectively reduces PM 2.5 in different model layers (higher reductions in the 15 3-8 km layer, smaller below 2 km) to better fit all observations simultaneously. This can generate a shift in the AE as different size distributions are found at different heights. On the other hand, the different assimilation approaches generate opposite results for aerosol number concentrations (Fig. 11b), with the single and multiple wavelength cases reducing and increasing it below 5 km, respectively. As explained in the previous 20 paragraph, when assimilating 550 nm AOD only, the long correlation length generates uniform modifications along bin sizes, so as the total aerosol concentration is reduced, aerosol in the small bin sizes (where aerosol number dominates) will also be reduced, not changing the overall size distribution (Fig. 11c). Again, multiple-wavelength AOD assimilation will selectively modify size bins to create a better fit to observations at all 25 wavelengths, even if changes go in opposite directions between bin sizes. In the case shown, coarse and fine size bins are reducing and increasing its mass respectively, which globally reduces mass (Fig. 11a) but increases number (Fig. 11b), changing the size distribution (Fig. 11c) directions can have a great impact in this region, as these aerosols can act as cloud condensation nuclei and substantially modify cloud properties (Saide et al., 2012b).
Comparisons over coastal AERONET stations show that for periods when the singleand multi-wavelength assimilations have differences (flow towards the coast), assimilation of multi-spectral AOD tends to show slightly better performance against 870 nm 5 AOD (Fig. 12, left column), but the single wavelength assimilation still shows very good skill as mentioned previously. Stronger differences can be appreciated in the AE time series (Fig. 12, right column). The 550 nm AOD assimilation usually follows the non-assimilated model closely while the multiple wavelength assimilation deviates from it and generally fits the observation better. The error reductions comparing to 10 AERONET AE are not as significant as the ones shown when comparing to MODIS AE. This is probably because MODIS retrieved AE, which has yet to be validated (Remer et al., 2005), is often inconsistent with AERONET AE (Fig. 12, right column). In this sense, obtaining observationally constrained retrievals for multiple wavelengths AOD and AE would allow assimilations to obtain additional improvements as shown 15 in Sect. 3.2. Also, further work evaluating against marine AERONET stations and/or Maritime aerosol network (MAN) data (Smirnov et al., 2011) is needed to substantiate these conclusions.

Conclusions
We developed the ability for the GSI system to perform AOD assimilation to correct 20 WRF-Chem aerosol fields when simulations are done with the MOSAIC sectional aerosol model. This enables the assimilation to impact aerosol concentrations, size and composition. In doing so, we added several new capabilities to the GSI system which include: using the AOD forward and adjoint Mie computations from WRF-Chem routines making GSI results consistent with the forecasts; adding the use of logarithmic variance matrix by the use of GSI recursive filters; and modeling aerosol water uptake as done in MOSAIC considering atmospheric conditions and the electrolytes present. The assimilation is performed using as state variable total mass within each size bin, significantly reducing computational resources used compared to using all species in all size bins. This is all demonstrated on a 3DVAR assimilation system, but it could 5 eventually be applied in more sophisticated frameworks such as 4DVAR or Kalman filter systems to make use their strengths over 3DVAR (e.g. Pagowski and Grell, 2012). This newly developed assimilation scheme was demonstrated in a regional forecast application for one month over California and its surroundings. The first set of assimilation experiments explored the use of observationally constrained AOD retrievals 10 (NASA NNR and NRL-UND) against using operational MODIS 550 nm dark target data. All three assimilations decreased global error and biases by improving forecasts on a large fraction of PM 2.5 and AOD monitoring ground stations. The assimilation of observationally constrained retrievals had consistently better performance compared to the operational MODIS data as they corrected the spatial biases and quality con-15 trolled odd retrievals, with the NASA NNR producing the higher error reductions (due to a larger amount of data) and the NRL-UND showing the higher fraction of PM 2.5 stations improved (96 %, due to the more restrictive quality control applied). These assimilation experiments did not change the overall aerosol composition, thus degrading model performance for single aerosol species that had an opposite bias to the global 20 tendency. Improvements in the non-assimilated estimates are necessary to correct this issue, which could be achieved in the study case by incorporating missing SO 2 emissions and processes not modeled such as secondary organic aerosol formation.
A second set of experiments assessed the impact of assimilating fine mode and multiple-wavelength AOD. Results showed that while single wavelength assimilation 25 did not significantly change size distributions, assimilation of additional data selectively modified aerosol at different vertical layers and changed size distributions, producing a better fit to the Angstrom Exponent (AE), an indicator of aerosol particle size distributions. The inclusion of fine AOD could not outperform the assimilation of just total AOD when comparing AOD burdens, possibly due to a mismatch between the fine mode fraction definition on model and retrieval. On the other hand, forecasts including multiple wavelengths in the assimilation further reduced errors for MODIS 550 nm and 870 nm AOD and simultaneously improved the 550-870 nm AE. The use of multiple wavelengths in the assimilation was also found to have positive influence on predic-5 tions at coastal AERONET sites. However, AE error reductions were not as significant as when evaluating with MODIS AE, possible due to an inaccurate performance of the MODIS against AERONET AE.
In this paper we showed the value of assimilating observationally constrained AOD and multiple wavelength data over assimilation of off-the-shelf 550 nm AOD products.

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version