In an attempt to improve the forecasting of atmospheric aerosols,
the ensemble square root filter algorithm was extended to simultaneously
optimize the chemical initial conditions (ICs) and emission input. The forecast
model, which was expanded by combining the Weather Research and Forecasting
with Chemistry (WRF-Chem) model and a forecast model of emission scaling
factors, generated both chemical concentration fields and emission scaling
factors. The forecast model of emission scaling factors was developed using
the ensemble concentration ratios of the WRF-Chem forecast chemical
concentrations together with a time smoothing operator. Hourly surface fine
particulate matter (PM
Aerosol prediction by regional air quality models in heavily polluted regions is challenging because of many factors. In addition to deficiencies in the chemistry mechanisms, uncertainties in the primary and precursor emissions and in the initial conditions (ICs) also limit the forecast accuracy. Data assimilation (DA), which is used to improve the ICs of aerosols and to optimize aerosol emissions, has been shown to be one of the most effective ways to improve the forecasting of aerosol pollution.
From the perspective of reducing the uncertainties in the ICs for aerosols, recent efforts have focused on assimilating aerosol observations using optimal interpolation (Collins et al., 2001; Yu et al., 2003; Adhikary et al., 2008; Tombette et al., 2009; Lee et al., 2013) or variational (Kahnert, 2008; Zhang et al., 2008; Benedetti et al., 2009; Pagowski et al., 2010; Liu et al., 2011; Schwartz et al., 2012; Li et al., 2013; Jiang et al., 2013; Saide et al., 2013) DA algorithms. Ensemble-based DA algorithms, such as the ensemble Kalman filter (EnKF) (Sekiyama et al., 2010; Schutgens et al., 2010a, b; Pagowski and Grell, 2012; Dai et al., 2014; Rubin et al., 2016; Ying et al., 2016; Yumimoto et al., 2016) and the hybrid variational–ensemble DA approach (Schwartz et al., 2014) have also been applied to aerosol predictions. All these studies have shown that DA is one of the most effective ways of improving aerosol forecasting through assimilating aerosol observations from multiple sources (e.g. ground-based observations and satellite measurements) to update the chemical ICs.
Numerous studies have used DA approaches to estimate or improve source
emissions. The EnKF is one of the most popular DA algorithms used to improve
estimates of aerosols and gas-phase emissions, such as NO
The optimization of chemical ICs and pollution emissions can improve aerosol
forecasts and therefore further improvements are likely to be achieved by
simultaneously optimizing the chemical ICs and emissions. Tang et al. (2011)
reported that the simultaneous adjustment of the ICs of O
We developed a system to adjust the chemical ICs and source emissions
jointly within an EnKF system coupled to the Weather Research and
Forecasting with Chemistry (WRF-Chem) model (Grell et al., 2005). We then
applied this system to assimilate hourly surface PM
The remainder of the paper is organized as follows. Section 2 describes this
DA system in detail and Sect. 3 describes the PM
For a chemical model like WRF-Chem, the emissions are the model forcing (or
boundary condition) rather than model states. Therefore, a forecasting
model,
Version 3.6.1 of the WRF-Chem model (Grell et al., 2005) was used to forecast the aerosol and chemical species. WRF-Chem is an online model with fully coupled chemical and meteorological components.
Most of the WRF-Chem settings were the same as those reported in Liu et al. (2011):
the Goddard Chemistry Aerosol Radiation and Transport (GOCART)
aerosol scheme coupled with the Regional Atmospheric Chemistry Mechanism for
gaseous chemical mechanisms, the WRF single-moment five-class microphysics
scheme, the Rapid Radiative Transfer Model longwave and Goddard shortwave
radiation schemes, the Yonsei University (YSU) boundary layer scheme, the
Noah land surface model and the Grell-3D cumulus parameterization. For the
GOCART aerosol scheme, the aerosol species include 14 defined aerosol
species and a 15th variable representing unspeciated aerosol
contributions (
Figure 1 illustrates the model computational domain. It has 120
Locations of 77 PM
With respect to the emissions, the hourly prior anthropogenic emissions were based on the monthly regional emission inventory in Asia (Zhang et al., 2009) for the year 2006, interpolated to the model grid. The power generator emissions were interpolated to the lowest eight vertical levels (Woo et al., 2003; de Meij et al., 2006; Wang et al., 2010). Other anthropogenic emissions were assigned entirely to the first level. Emissions are very small above 500 m for all pollutants. To keep the prior anthropogenic emissions objective, no time variation was added; the hourly prior anthropogenic emissions were therefore constant. The biogenic (Guenther et al., 1995), dust (Ginoux et al., 2001), dimethyl sulfide and sea salt emissions (Chin et al., 2000, 2002) were calculated online.
As no suitable dynamic model was available to forecast the emission scaling
factors, a persistence forecasting operator served as the forecast model for
the scaling factors, similar to the method used by Peng et al. (2015) for
CO
If the ensemble members of the updated chemical fields
The ensemble spreads of
As the concentrations were closely related to the emissions both locally and
in the upwind regions and there is no suitable dynamic model available to
forecast the emission scaling factors, the inflated concentration ratios
To incorporate the useful information from the previous times, the previous
DA cycles' analysis scaling factors,
The ensemble members of the emissions were calculated according to
It should be noted that, although the method is very similar to that used by Peters et
al. (2007) and Peng et al. (2015) for CO
The EnSRF algorithm was introduced by Whitaker and Hamill (2002) and its extension to the analysis of aerosol ICs was described by Schwartz et al. (2014). The traditional EnKF with perturbed observations (Evensen, 1994) introduces sampling errors by perturbing the observations. In contrast, the EnSRF (Whitaker and Hamill, 2002) and the ensemble adjustment Kalman filter (EAKF; Anderson, 2001) obviate the need to perturb the observations. The local ensemble Kalman filter (LEKF), a kind of EnSRF, was presented by Ott et al. (2002, 2004). It is computationally more efficient than the traditional EnKF because it assimilates all the observations within each spatially local volume simultaneously and treats the local volumes independently. The local ensemble transform Kalman filter (LETKF; Hunt et al., 2007) integrates the advantages of the ensemble transform Kalman filter (ETKF; Bishop et al., 2001) and the LEKF. The computational cost of the LETKF is much lower than that of the original LEKF because the former does not require an orthogonal basis. Although the LETKF has additional advantages, we chose the same EnSRF as Schwartz et al. (2014) because it had already been extended to the analysis of aerosol ICs, so no further extension was needed for that purpose.
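To make the contrast with the perturbed-observation EnKF concrete, the following is a minimal sketch of a serial EnSRF update for a single scalar observation, in the form given by Whitaker and Hamill (2002). It is an illustration only, not the code of the DA system described here: the function name, argument names and array layout are hypothetical, and localization and inflation are omitted. In a joint analysis, the state array could hold both concentrations and emission scaling factors.

```python
import numpy as np

def ensrf_update_single_obs(ens, hx, y, r):
    """Serial EnSRF update for one scalar observation (illustrative sketch).

    ens : (n_members, n_state) prior ensemble of the state vector
    hx  : (n_members,) prior ensemble mapped to observation space
    y   : scalar observation (not perturbed)
    r   : scalar observation error variance
    """
    n = ens.shape[0]
    x_mean = ens.mean(axis=0)
    x_pert = ens - x_mean
    hx_mean = hx.mean()
    hx_pert = hx - hx_mean

    # Sample covariances estimated from the ensemble
    pht = x_pert.T @ hx_pert / (n - 1)   # P H^T, shape (n_state,)
    hpht = hx_pert @ hx_pert / (n - 1)   # H P H^T, scalar

    gain = pht / (hpht + r)              # Kalman gain for the mean
    # Reduced gain factor for the perturbations (Whitaker and Hamill, 2002)
    alpha = 1.0 / (1.0 + np.sqrt(r / (hpht + r)))

    mean_a = x_mean + gain * (y - hx_mean)
    pert_a = x_pert - alpha * np.outer(hx_pert, gain)
    return mean_a + pert_a
```

Because the perturbations are updated with the reduced gain rather than with perturbed observations, the analysis ensemble variance matches the Kalman filter value without additional sampling noise from the observations.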
Following the notation of Ide et al. (1997), given an
Note that for the joint analysis of ICs and emissions, the state vector
In this work, a 50-member ensemble was chosen, following Schwartz et al. (2012)
and Whitaker and Hamill (2002). Covariance localization forced the EnSRF
analysis increments to zero at 1280 km from an observation in the horizontal
and at one scale height in the vertical, to reduce spurious correlations caused
by sampling error for all control variables, similar to Pagowski et al. (2012) and Schwartz
et al. (2012, 2014). In addition, posterior (after assimilation)
multiplicative inflation following Whitaker and Hamill (2012) was applied
to the concentration analysis only, to maintain the ensemble spread. The
inflation factor
As stated in Sect. 2.2, the state variables of the analysis of the ICs
were the 15 WRF-Chem/GOCART aerosol variables. The PM
From the perspective of the optimization of emissions, four species of
emission scaling factors (
The direct sources of PM
Figure 2b shows the workflow of the DA system. The steps in this workflow
are as follows.
The persistence forecasting operator The ensemble members of the emissions,
Natural emissions, such as dust and sea salt emissions, were not perturbed
explicitly when the forecast emissions were generated. However, emissions of
dust and sea salt were parameterized within the GOCART model (Chin et al.,
2002). Within the DA system, varying meteorology across the members
implicitly perturbed dust and sea salt emissions. Forced by the changed emissions
( The model-simulated PM In the assimilation step, the state variables, the concentrations of
14 defined aerosol species and a 15th unspeciated aerosol, and the four species
of emission scaling factors, After the assimilation step, the optimized emissions
(
Hourly averaged surface PM
The observation error covariance matrix
The PM
Two parallel experiments were performed to evaluate the impact of PM
The initialization and spin-up procedures were identical to those reported by Schwartz et al. (2014). The ICs and lateral boundary conditions (LBCs) for the meteorological fields were provided by the National Centers for Environmental Prediction Global Forecast System (GFS).
The initial meteorological fields were created at 00:00 UTC on 1 October 2014 by interpolating the GFS analyses onto the model domain. The 50 ensemble members were then generated by adding Gaussian random noise with a zero mean and static background error covariances (Torn et al., 2006) to the temperature, water vapour, velocity, geopotential height and dry surface pressure fields. The ICs of each member were zero in the initial aerosol fields, representing clean conditions as described by Liu et al. (2011).
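The generation of ensemble members by adding zero-mean Gaussian noise drawn from a prescribed covariance can be sketched as follows. This is a toy illustration under strong assumptions: a small dense covariance matrix is factorized directly, whereas the static background error covariances used in practice are applied through the DA system's own operators. The function name, the arguments and the re-centering of the perturbations are hypothetical choices.

```python
import numpy as np

def perturb_ensemble(base, cov, n_members, seed=0):
    """Draw an ensemble around `base` with perturbations ~ N(0, cov).

    base : (n_state,) unperturbed field, flattened
    cov  : (n_state, n_state) symmetric positive definite covariance
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(cov)              # cov = chol @ chol.T
    z = rng.standard_normal((n_members, base.size))
    pert = z @ chol.T                           # each row has covariance cov
    # Assumption: remove the sample mean so the ensemble mean equals `base`
    pert -= pert.mean(axis=0)
    return base + pert
```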
The LBCs for the meteorological fields were then interpolated from the GFS analyses from 00:00 UTC on 1 October to 00:00 UTC on 16 October 2014 and perturbed similarly to the initial fields at 00:00 UTC on 1 October 2014. The aerosol LBCs of each member for all experiments were idealized profiles embedded within the WRF-Chem model.
Fifty-member emissions were created by adding random noise to the
anthropogenic emissions, in the same way as reported by Schwartz et al. (2014):
Before the first DA cycle, a 50-member ensemble of 4-day WRF-Chem forecasts was performed from 00:00 UTC on 1 October to 23:00 UTC on 4 October 2014 using the perturbed ICs at 00:00 UTC on 1 October 2014, the corresponding perturbed LBCs and the emissions. Then a 50-member ensemble aerosol forecast at 00:00 UTC on 5 October 2014 was produced.
Two DA experiments were performed. One was the pure assimilation of chemical ICs (hereafter expC), while the other was the joint adjustment of chemical ICs and source emissions (hereafter expJ). Both DA experiments had the same settings except for the emissions. They were conducted from 00:00 UTC on 5 October 2014 to 00:00 UTC on 16 October 2014. The assimilation cycle interval was 1 h.
In the first DA cycle in expJ, the first 50 ensemble chemical fields were
drawn from the WRF-Chem ensemble forecasts valid at 00:00 UTC on 5 October 2014,
as described in Sect. 4.1. Using the ensemble aerosol forecasts, the prior
emission scaling factors
In expC, the first chemical fields were also drawn from the WRF-Chem
ensemble forecasts valid at 00:00 UTC on 5 October 2014. Then, the state vector
During the WRF-Chem forecast step of the subsequent assimilation cycles for both
experiments, the ICs for the chemical variables of each member were drawn
from the updated chemical fields of the previous cycle. The aerosol LBCs of
each member for all experiments were idealized profiles embedded within the
WRF-Chem model. As for the meteorological ensemble fields, the LBCs were
prepared in advance as described in Sect. 4.1; the ICs of each member of
the meteorological fields were drawn from the forecast meteorological fields
of the previous cycle and then re-centered on the GFS analysis, because no
meteorological analysis was performed:
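Re-centering of this kind is commonly implemented by replacing the ensemble mean with the analysis while keeping each member's perturbation about the mean. The following is a minimal sketch under that assumption; the function and argument names are hypothetical and are not taken from the DA system described here.

```python
import numpy as np

def recenter(ens, analysis):
    """Shift an ensemble so that its mean equals `analysis`.

    ens      : (n_members, n_state) forecast ensemble
    analysis : (n_state,) field to re-center on (e.g. an interpolated analysis)
    """
    # Member perturbations about the ensemble mean are left unchanged
    return ens - ens.mean(axis=0) + analysis
```

With this form, the ensemble spread carried over from the previous cycle is preserved, while the ensemble mean follows the external analysis.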
As stated in the first paragraph in this section, the settings of expC were the same as those in expJ except for the emissions. In expJ, the ensemble anthropogenic emissions were generated by using emission scaling factors, while in expC the ensemble anthropogenic emissions were prepared by adding random noise, as stated in Sect. 4.1.
The control experiment was conducted for the same period as the assimilation
experiment and the simulation cycle period was 1 h, as in the assimilation
experiment. The first initial chemical fields were extracted from the
ensemble mean valid at 00:00 UTC on 5 October 2014. In the subsequent simulation
process, the ICs for the chemical fields were from the previous cycle's 1 h
forecast. The LBCs and ICs for the meteorological fields were updated by
interpolating the GFS analyses. The emissions were the prescribed emissions
Statistics for both expJ and expC were computed using the ensemble mean prior (background) and posterior (analysis) fields (average of the 50-member ensemble). The ensemble performances were first examined. Output from the first day of the cycling DA configurations was excluded from all verification statistics to allow the ensemble fields to “spin up” from the initial ensemble.
As the measurement coverage is an important factor that may determine the performance in DA, we primarily focused our attention on the results from three sub-regions with comparatively dense observational coverage (Fig. 1): the Beijing–Tianjin–Hebei region (JJJ; 12 stations for assimilation and 12 stations for verification); the Yangtze River delta (YRD; 24 stations for assimilation and 24 stations for verification); and the Pearl River delta (PRD; 9 stations for assimilation and 9 stations for verification).
Assessing the ensemble performance is important for an ensemble-based DA
system. In a well-calibrated system, the prior ensemble mean
root-mean-square error (RMSE) with respect to the observations should equal
the prior “total spread” (the square root of the sum of the ensemble variance and
the observation error variance) (Houtekamer et al., 2005). Figure 3 shows the
time series for the prior ensemble mean RMSE and the total spread for
PM
The magnitudes of the ensemble spread of the emission scaling factors of the
joint DA experiment were important for emission inversion. They were very
stable throughout the
Comparison of the surface PM
Time series of prior ensemble mean RMSE and total spread for
PM
Spatial distribution of the PM
To evaluate quantitatively the impact of the ensemble assimilation system on
the ICs, the mean errors (bias), RMSEs and correlation coefficient (CORR) of
the assimilation experiment and the control run were first analysed. These
statistics were calculated against independent observations over all the
analyses from 6 to 16 October 2014. Table 1 shows that the bias magnitudes
of the control run were 15.9 and 20.6
It is interesting to note that in JJJ expC has a better RMSE and CORR than expJ but a poorer bias, whereas in PRD expC has a better bias and RMSE than expJ but a poorer CORR. The small number of samples may have introduced uncertainty into these statistics. However, the differences were very small and the analyses of both experiments were very similar.
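The verification statistics used in this section (bias, RMSE and CORR), together with the total spread used for the ensemble calibration check, can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code used here: the bias is taken as model minus observation, the total spread is averaged over observations before the square root is taken, and all function and argument names are hypothetical.

```python
import numpy as np

def verification_stats(model, obs):
    """Return (bias, rmse, corr) of model values against observations."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    diff = model - obs                      # assumption: bias = model - obs
    bias = diff.mean()
    rmse = np.sqrt((diff ** 2).mean())
    corr = np.corrcoef(model, obs)[0, 1]    # Pearson correlation coefficient
    return bias, rmse, corr

def total_spread(prior_obs_space, obs_err_var):
    """Square root of mean (ensemble variance + observation error variance).

    prior_obs_space : (n_members, n_obs) prior ensemble in observation space
    obs_err_var     : scalar or (n_obs,) observation error variance
    """
    ens_var = np.var(prior_obs_space, axis=0, ddof=1)
    return np.sqrt(np.mean(ens_var + obs_err_var))
```

In a well-calibrated system, the RMSE of the prior ensemble mean from `verification_stats` would be close to the value returned by `total_spread` for the same observations.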
Hourly area-averaged time series of emission scaling factors
(black) extracted from the ensemble mean of the analysed
Then the analysis increments (i.e.
To determine the impact of assimilating PM
Hourly area-averaged time series of emission scaling factors
extracted from the ensemble mean of the analysed
Spatial distribution of
Figure 5 also shows that although the prior emissions
The NO, SO
Figure 7 shows the spatial distribution of the time-averaged scaling factors
These patterns are consistent with those in Fig. 5. Negative differences
were obtained in most areas of the YRD and PRD, indicating that the
PM
As the economy in China has developed, the spatiotemporal distribution of emissions has changed as a result of changes in energy consumption, the structure of the energy market and advances in technology. Therefore, although this emission inventory may have correctly described anthropogenic emissions in 2006 when it was constructed, it is not representative of the anthropogenic emissions in 2014. Theoretically, the assimilated emissions should reduce the uncertainty in the prior emissions through the use of observations. In contrast to the standard national emission inventories reported by governments in the USA, Europe and other countries, the current emission inventories for China carry large uncertainties, even in their latest versions, because of rapid economic development and the complexity of emission sources. It is therefore not possible for us to evaluate the emissions directly.
Although we had no direct emission observations with which to evaluate the analysed
emissions, a challenge shared by many emission inversion research teams
(e.g. Tang et al., 2011; Miyazaki et al., 2012; Ding et al., 2015; McLinden
et al., 2016), the improvement of emissions can be verified in terms
of two aspects: the diurnal variation and the location of increased
emissions. The diurnal variation in the assimilated emissions verified this
statement to some extent. Especially in the PRD and YRD,
Spatial distribution of
However, the analysis emissions are only a mathematical optimum. They are
influenced greatly by the model errors and the observation errors. In
addition, only surface PM
For both assimilation experiments, 48 h forecasts with hourly output were initialized at 00:00 UTC each day from 6 to 16 October 2014. For the verification forecasting experiment for expJ (hereafter fcJ), the ensemble mean of the analysed ICs and emissions of expJ was used in this longer-range model forecast. For the verification forecasting experiment for expC (hereafter fcC), the ensemble mean of the analysed ICs of expC and the prescribed anthropogenic emissions were used.
To give a clearer picture of the impact of DA for both
assimilation experiments, time series of the hourly PM
Time series of the hourly PM
The bias and the RMSE of the surface PM
Bias of surface PM
The improvements in the surface PM
As for expC, it seemed that large improvements in the surface PM
Neither DA system performed as well in the JJJ region as in the YRD and PRD. In JJJ, relatively small improvements were achieved in the first 24 h of the forecast and none thereafter. One possible reason is systematic error arising from the chemistry mechanisms in WRF-Chem. The sources of aerosols are so complex that their formation mechanisms are far from fully understood, and large uncertainties remain in the model simulations. Chemical transport models tend to underestimate PM concentrations, especially during episodes of heavy pollution (Denby et al., 2007), because of missing reactions (Wang et al., 2014; Zhang et al., 2015; Zheng et al., 2015; Chen et al., 2016). A second reason lies in the forecast meteorological fields, which still carried large uncertainties, especially when the boundary layer was stable and the wind speed was very low during episodes of heavy pollution. As a result, a large bias may occur in forecasts of heavy pollution even with the ICs and emissions obtained from the joint assimilation. A third reason may be the sparse coverage of measurements: there were only 12 sites in the JJJ region (Fig. 1) and the measurement coverage was much sparser than in the YRD or PRD.
The EnSRF algorithm was extended to adjust the chemical ICs and the primary
and precursor emissions to improve forecasts for surface PM
There are still some limitations in this study. Firstly, we used the default
monthly anthropogenic emissions as the prior emissions and added no time
variation, to keep the prior objective, because no temporal allocation at
shorter but critical scales (e.g. day-of-week, diurnal) was available. As
shown in earlier work, constant emissions worsen the chemical
forecasts (de Meij et al., 2006; Wang et al., 2010). The joint DA system
itself cannot benefit from the constant prior anthropogenic emissions.
The decrease in the normalized RMSE in Fig. 10g partly reflects the poor
forecasts of the control run. The control run would perform better if
emissions were allowed to vary within the day, especially during the night,
so the relative reduction in RMSE during the night would then not be as
large. Secondly, no correlations between emission variables were considered
when perturbing the emissions, which reduced the correlations between
the variables. Thus, the chemical forecast will deviate from the truth to
some degree. Fortunately, the perturbed emissions were only used in the
initialization and spin-up experiment and in expC. Therefore, expJ and the
control run were not affected, although expC was. Thirdly,
This study represents the first step in the simultaneous optimization of
chemical ICs and emissions and only surface PM
Data used in this publication can be accessed by contacting the authors Zhen Peng (pengzhen@nju.edu.cn) and Zhiquan Liu (liuz@ucar.edu).
The authors declare that they have no conflict of interest.
We thank three anonymous reviewers for their helpful comments. This work was supported by the National Key Technologies Research and Development program of China (2016YFC0202102), the Strategic Priority Research Program – Climate Change: Carbon Budget and Relevant Issues (XDA05040404), and the National Natural Science Foundation of China (41575141). NCAR is sponsored by the US National Science Foundation. Edited by: T. Takemura. Reviewed by: three anonymous referees.