Improving PM 2 . 5 forecast over China by the joint adjustment of initial conditions and source emissions with an ensemble Kalman filter

In an attempt to improve the forecasting of atmospheric aerosols, the ensemble square root filter algorithm was extended to simultaneously optimize the chemical initial conditions (ICs) and emission input. The forecast model, which was expanded by combining the Weather Research and Forecasting with Chemistry (WRF-Chem) model and a forecast model of emission scaling factors, generated both chemical concentration fields and emission scaling factors. The forecast model of emission scaling factors was developed by using the ensemble concentration ratios of the WRF-Chem forecast chemical concentrations and also the time smoothing operator. Hourly surface fine particulate matter (PM2.5) observations were assimilated in this system over China from 5 to 16 October 2014. A series of 48 h forecasts was then carried out with the optimized initial conditions and emissions on each day at 00:00 UTC and a control experiment was performed without data assimilation. In addition, we also performed an experiment of pure assimilation chemical ICs and the corresponding 48 h forecasts experiment for comparison. The results showed that the forecasts with the optimized initial conditions and emissions typically outperformed those from the control experiment. In the Yangtze River delta (YRD) and the Pearl River delta (PRD) regions, large reduction of the root-mean-square errors (RMSEs) was obtained for almost the entire 48 h forecast range attributed to assimilation. In particular, the relative reduction in RMSE due to assimilation was about 37.5 % at nighttime when WRF-Chem performed comparatively worse. In the Beijing– Tianjin–Hebei (JJJ) region, relatively smaller improvements were achieved in the first 24 h forecast but then no improvements were achieved afterwards. Comparing to the forecasts with only the optimized ICs, the forecasts with the joint adjustment were always much better during the night in the PRD and YRD regions. However, they were very similar during daytime in both regions. Also, they performed similarly for almost the entire 48 h forecast range in the JJJ region.


Introduction
Aerosol prediction by regional air quality model in heavy polluted regions is challenging due to many factors.In addition to the deficiency of chemistries, the uncertainties of primary and precursor emissions and the initial conditions (ICs) also limit the forecast accuracy.Data assimilation (DA), which is used to improve the ICs of aerosols and to optimize data on aerosol emissions, has been shown to be one of the most effective ways to improve the forecasting of aerosol pollution.
Numerous studies have used DA approaches to estimate or improve source emissions.The EnKF is one of the most popular DA algorithms used to improve estimates of aerosols and gas-phase emissions, such as NO x , volatile organic compounds and SO 2 (van Loon et al., 2000;Heemink and Segers, 2002;Zhang et al., 2005;Barbu et al., 2009;Sekiyama et al., 2010;Huneeus et al., 2012Huneeus et al., , 2013;;Schutgens et al., 2012;Miyazaki et al., 2014).Variational DA algorithms have also been applied to constrain emissions of air pollution, such as black carbon, organic carbon, dust, NH 3 , SO x and NO x (Hakami et al., 2005;Elbern et al., 2007;Henze et al., 2007Henze et al., , 2009;;Yumimoto et al., 2007Yumimoto et al., , 2008;;Dubovik et al., 2008;Wang et al., 2012;Guerrette and Henze, 2015).These studies have indicated that DA can efficiently reduce the uncertainty in the emission inventories and lead to improvements in the forecasting of air quality (Mijling and van der A, 2012).
The optimization of chemical ICs and pollution emissions can improve aerosol forecasts and therefore further improvements are likely to be achieved by simultaneously optimizing the chemical ICs and emissions.Tang et al. (2011) reported that the simultaneous adjustment of the ICs of O 3 , NO x and volatile organic compounds and the emissions of NO x and volatile organic compounds produced overall better performance in both the 1 and 24 h ozone forecasts than the adjustment of pure ICs or emissions.Miyazaki et al. (2012) reported that the simultaneous adjustment of emissions and concentrations is a powerful approach to correcting the tropospheric ozone budget and profile analyses.
We developed a system to adjust the chemical ICs and source emissions jointly within an EnKF system coupled to the Weather Research and Forecasting with Chemistry (WRF-Chem) model (Grell et al., 2005).We then applied this system to assimilate hourly surface PM 2.5 measurements over China in early October 2014.
The remainder of the paper is organized as follows.Section 2 describes this DA system in detail and Sect. 3 describes the PM 2.5 observations.Then the experimental designs are introduced in Sect. 4. Finally, the surface PM 2.5 observations assimilation results are presented in Sect. 5 before concluding in Sect.6.

Forecast model
For a chemical model like WRF-Chem, the emissions are the model forcing (or boundary condition) rather than model states.Therefore, a forecasting model, M, was developed to forecast the emission scaling factors (representing emissions) as well as the aerosol concentrations.This model combines the WRF-Chem model and the forecast model of emission scaling factors.

WRF-Chem model
Version 3.6.1 of the WRF-Chem model (Grell et al., 2005) was used to forecast the aerosol and chemical species.WRF-Chem is an online model with the fully coupled chemical and meteorological components.
Most of the WRF-Chem settings were the same as those reported in Liu et al. (2011): the Goddard Chemistry Aerosol Radiation and Transport (GOCART) aerosol scheme coupled with the Regional Atmospheric Chemistry Mechanism for gaseous chemical mechanisms, the WRF singlemoment five-class microphysics scheme, the Rapid Radiative Transfer Model longwave and Goddard shortwave radiation schemes, the Yonsei University (YSU) boundary layer scheme, the Noah land surface model and the Grell-3D cumulus parameterization.For the GOCART aerosol scheme, the aerosol species include 14 defined aerosol species and a 15th variable representing unspeciated aerosol contributions (P 25 ).The 14 defined aerosol species are sulfate, hydrophobic and hydrophilic organic carbon (OC 1 and OC 2 respectively), hydrophobic and hydrophilic black carbon (BC 1 and BC 2 respectively), dust in five particle size bins (effective radii of 0.5, 1.4, 2.4, 4.5 and 8.0 µm, referred to as D 1 , D 2 , D 3 , D 4 and D 5 respectively) and sea salt in four particle size bins (effective radii of 0.3, 1.0, 3.25 and 7.5 µm for dry air, referred to as S 1 , S 2 , S 3 and S 4 respectively).
Figure 1 illustrates the model computational domain.It has 120 × 120 horizontal grid scales at a 40.5 km spacing by the lambert conform map projection centred at (35 • N, 105 • E).There are 57 vertical levels with the model top at 10 hPa, about 12 layers within the planetary boundary layer (among them the lowest 8 layers were under 500 m) and the first layer centred at ∼ 12 m.
With respect to the emissions, the hourly prior anthropogenic emissions were based on the monthly regional emission inventory in Asia (Zhang et al., 2009) for the year 2006 interpolated to the model grid.The power generator emissions were interpolated for the lowest eight vertical levels (Woo et al., 2003;de Meij et al., 2006;Wang et al., 2010).Other anthropogenic emissions were assigned totally to the first level.Emissions are very small above 500 m for all pollutants.In order to keep objective for the prior anthropogenic emissions, no time variation was added.Thus, the hourly prior anthropogenic emissions were constant.The biogenic (Guenther et al., 1995), dust (Ginoux et al., 2001), dimethyl sulfide and sea salt emissions (Chin et al., 2000(Chin et al., , 2002) ) were calculated online.

Forecast model of scaling factors
As no suitable dynamic model was available to forecast the emission scaling factors, a persistence forecasting operator served as the forecast model for the scaling factors, similar to the method used by Peng et al. (2015) for CO 2 emission inversion.Figure 2a shows the flowchart for the persistence forecasting operator M SF .
If the ensemble members of the updated chemical fields C a i,t−1 (the subscript "i" refers to the ith ensemble member, the superscript "a" refers to the analysis and "t" refers to the time) and the forecast emissions E f i,t−1 (the superscript "f" refers to the forecast) in the previous assimilation cycle are known, then the chemical fields C f i,t at time t can be generated via WRF-Chem (Fig. 2b).In the actual process, C f i,t were available in the previous assimilation cycle, so we did not need to perform the ensemble forecasts again.A dotted box was used in Fig. 2a to indicate that the ensemble forecasts were not performed in real process.The ensemble concentration ratios κ i,t , (i = 1, . . ., N ) are then calculated using where is the ensemble mean of the forecast.The ensemble mean of κ i,t is where κ i,t are numbers distributed around 1 and with ensemble mean values of 1.The ensemble spreads of κ i,t , (i = 1, . . ., N) may be small and therefore covariance inflation is used to maintain them at a certain level: (3) In Peng et al. (2015), the CO 2 DA system worked comparatively well when the ensemble spread of λ a i,t ranged from 0.05 to 1.25 for β = 60, 70, 75, 80.The assimilated CO 2 fluxes deviated markedly from the "true" CO 2 fluxes when the ensemble spread of λ a i,t were too small for β = 10, 50 or when the ensemble spread of λ a i,t were too large for β = 100.Therefore, in this work, β = 1.5 was chosen to make ensure the ensemble spread of (κ i,t ) inf ranged from 0.1 to 1.25.Same as κ i,t , the ensemble mean values of (κ i,t ) inf are 1.It is noted that perhaps there are very few negative values for (κ i,t ) inf after inflation.A quality control procedure is performed for (κ i,t ) inf before further appliance.All these negative data were set as 0.001 in this work.There was no special reason to set them as 0.001.It is also fine to set them as 0. Then (κ i,t ) inf were re-centred to ensure the ensemble mean values of (κ i,t ) inf were all 1.
As the concentrations were closely related to the emissions both locally and in the upwind regions and there is no suitable dynamic model available to forecast the emission scaling factors, the inflated concentration ratios (κ i,t ) inf serve as the prior emission scaling factors λ p i,t : The above equation is not supported according to the mass conservation equation but just for the purpose to generate the ensemble emissions.As with (κ i,t ) inf , λ p i,t are numbers distributed around 1. From the perspective of generating the ensemble emissions, they can play the same role as other data, such as the random numbers created by using the standard normal distribution function.However, there are correlations among the grid points of (κ i,t ) inf because (κ i,t ) inf are calculated through a short-term forecast of WRF-Chem.Thus, λ p i,t have the same correlations as (κ i,t ) inf , while the random numbers are totally different; there are no correlations unless they are generated under certain correlations.To incorporate the useful information from the previous times, the previous DA cycles' analysis scaling factors, λ a i,t−M+1 , . . ., λ a i,t−2 , λ a i,t−1 , and the prior scaling factor λ p i,t were used to estimate λ f i,t by the time smooth operator; namely, Here, M is the time window of the smooth operator.In this study, a value of M = 4 (h) was chosen.According to the smooth operator, the ensemble mean values of λ f i,t depend on the ensemble mean of λ a i,t−M+1 , . . ., λ a i,t−2 , λ a i,t−1 , λ p i,t , where the ensemble means of λ p i,t are all 1.After multiple iterations, the smooth operator can give comparatively good estimation for λ f i,t since anthropogenic emissions are stable at a certain timescale (Mijling et al., 2012).It is a compromise between prescribed prior emissions and letting the system propagate all observation information from one step to the next without any guidance (Peters et al., 2007), for the case M = 4.
The ensemble members of the emissions were calculated according to where E i,t is the ith ensemble member of the emissions for each grid at time t, λ i,t represents the scaling factors and E p t is the prescribed emission, which can be obtained from the emission inventories.It is noted that the correlations among the grid points of the prior emissions depend on λ p i,t .These correlations may deviate far from the truth but we have no other suitable substitute.However, the correlations among the grid points of the forecast emissions should be more or less close to the truth due to the appliance of the smooth operator after multiple iterations.
It is noted although the method is very similar to that used by Peters et al. (2007) and Peng et al. (2015) for CO 2 emission inversion, it is still of novelty for applications in aerosol anthropogenic emissions.In Peters et al. (2007), λ p i,t were all 1.Only natural CO 2 emissions (i.e.biospheric and oceanic emissions) were assimilated at the ecological scale due to the "signal-to-noise" problem.Thus, the uncertainty of anthropogenic and other CO 2 emissions was ignored.Besides, the framework is more advanced compared to our previous work.In Peng et al. (2015), in order to generate λ p i,t , a set of ensemble forecasts were performed from time t to t + 1 to produce the CO 2 concentration fields, forced by the prescribed net CO 2 surface fluxes with the previous assimilated concentration fields as initial conditions.That means that the ensemble forecast was performed twice in that DA system, which time consuming.However, in order to save computing time, we used the chemical fields C f i,t available in the previous assimilation cycle to calculate λ p i,t in this work.Thus, WRF-Chem runs to forecast only once during a DA cycle.

Ensemble square root filter (EnSRF)
The EnSRF algorithm was introduced by Whitaker and Hamill (2002) and its expansion to analysing aerosol ICs was described by Schwartz et al. (2014).The traditional EnKF with perturbed observations (Evensen, 1994) introduces sampling errors by perturbing the observations.In contrast to the traditional EnKF, the EnSRF (Whitaker and Hamill, 2002) and the ensemble adjustment Kalman filter (EAKF; developed by Anderson, 2001) obviate the need to perturb the observations.The local ensemble Kalman filtering (LEKF), a kind of EnSRF, was presented by Ott et al. (2002Ott et al. ( , 2004)).It was computationally more efficient compared to the traditional EnKF since it simultaneously assimilates the obser-vations within a spatially local volume independently.The local ensemble transform Kalman filter (LETKF; Hunt et al., 2007) integrates the advantages of the ensemble transform Kalman filter (ETKF; developed by Bishop et al., 2001) and the LEKF.The computational cost of LETKF is much lower than that of the original LEKF because the former does not require an orthogonal basis.Though LETKF has more advantages, we still chose the same EnSRF as Schwartz et al. (2014) because we did not need to extend it to analysing aerosol ICs, very similar to Schwartz et al. (2014).
Following the notation of Ide et al. (1997), given an m-dimensional background model forecast vector x b , a pdimensional observation vector y o and an operator H that converts the model state to the observation states, we expressed the variables as an ensemble mean (denoted by an overbar) and a deviation from the mean (denoted by a prime).Thus, the ensemble mean x a of the analysed state x a and the deviations x a from the ensemble mean are updated separately by where K is the traditional Kalman gain matrix and K is the gain used to update the deviations from the ensemble mean.These are given by where background error covariance matrix and R is the p × pdimensional diagonal observation error covariance matrix.In real applications, P b H T and H P b H T will be approximated using the background ensemble; namely, In Eqs. ( 11) and ( 12), N is the ensemble size.
Note that for the joint analysis of ICs and emissions, the state vector x is the joint vector of the mass concentration C and the emission scaling factor λ, i.e. x = [C, λ] T .In this study, the state variables of the analysis of the ICs were the 15 WRF-Chem/GOCART aerosol variables, same as that reported by Schwartz et al. (2012).The state variables of the emission scaling factors include λ PM 2.5 , λ SO 2 , λ NO and λ NH 3 and are described in Sect.2.3.1.After each ensemble analysis, the ensemble forecasts were performed with the corresponding models to advance C and λ to the next analysis time.
In this work, a 50-member ensemble was chosen, following Schwartz et al. (2012) and Whitaker and Hamill (2002).Covariance localization forced EnSRF analysis increments to zero 1280 km from an observation in the horizontal and one scale height to reduce spurious correlations due to sampling error for all control variables, similar to Pagowski et al. (2012) and Schwartz et al. (2012Schwartz et al. ( , 2014)).In addition, posterior (after assimilation) multiplicative inflation following Whitaker and Hamill (2012) was applied aiming to maintain ensemble spread for only the concentration analysis.The inflation factor α = 1.2 was chosen as in Pagowski et al. (2012) and Schwartz et al. (2012Schwartz et al. ( , 2014)).Additive or prior inflation was not employed.As for the emission scaling factor λ, the inflation was not used at this step.

State variables
As stated in Sect.2.2, the state variables of the analysis of the ICs were the 15 WRF-Chem/GOCART aerosol variables.The PM 2.5 observation operator was the same as that described by Schwartz et al. (2012) and expressed as where ρ d represents the dry air density, which is multiplied by the mixing ratios of aerosol species (in µg kg −1 ) to convert the units to µg m −3 for consistency with the observations.From the perspective of the optimization of emissions, four species of emission scaling factors (λ PM 2.5 , λ SO 2 , λ NO and λ NH 3 ) were also considered as the state variables of the DA system.Atmospheric inorganic aerosols are not only from the primary emissions but also from secondary processes -chemical and thermodynamic transformations from the gas-phase precursors.Therefore, not only the primary sources of PM 2.5 but also the sources of the gas-phase precursors need to be optimized.In this study, the sources of SO 2 , NO x and NH 3 (E SO 2 , E NO and E NH 3 ), which have a large impact on the distribution of PM 2.5 , were also optimized in addition to the primary sources of PM 2.5 .It is noted that for the optimization of the emission scaling factors, M SF serves as the forecast model and the observation operator reflects the combined information of emissions (in the format of λ in Eq. 6), the physics and chemistry processes in WRF-Chem simulations and the transformation PM 2.5 from model space to observation space (Eq.13).
The direct sources of PM 2.5 include the unspeciated primary sources of PM 2.5 E PM 2.5 , sulfate E SO 4 , nitrate E NO 3 , organic compounds E org and elemental compounds E BC ; all of them are given in two modes (the nuclei and accumulation modes, represented as i and j in the subscripts respectively).The ratios between the nuclei and accumulation modes were the same as in the suggested emission process for National Emission Inventory in WRF-Chem (Freitas et al., 2011).The formulae of sulfate and nitrate emissions in the model are as follows: where E EC represents elemental carbon and E ORG organic compounds, and a = 0.074 and b = 0.038 were chosen based on the internal emissions and observational data.In the DA process, the first six species of direct sources of emissions (E PM 2.5 i , E PM 2.5 j , E SO 4 i , E SO 4 j , E NO 3 i and E NO 3 j ), which may have larger uncertainties in heavy polluted events, were updated according to the variation of λ PM 2.5 .E PM 2.5 i and E PM 2.5 j were directly updated according to the variation in λ PM 2.5 .The emissions (E SO 4 i , E SO 4 j , E NO 3 i and E NO 3 j ) were also updated according to the variations in E PM 2.5 i and E PM 2.5 j .E EC and E ORG of the anthropogenic emissions were not assimilated, which is a limitation in this work.Besides, emissions of dust and sea salt were not assimilated.It is true that these emissions are also important for the atmosphere aerosol.The reason we did not assimilate E EC and E ORG is that only the PM 2.5 measurements are used in this DA experiment.However, the sources of the aerosols (especially organic aerosols) are so complex that our knowledge of their formation mechanisms is far from clear.Though it is technically possible to have all emissions assimilated, with such limited observations adding more control variables would cause much more uncertainties in the system which might lead to unreasonable analysis.

The ensemble members of the emissions, E
NO and E f NH 3 , are prepared according to Eq. ( 6).The corresponding emissions of i and E f NO 3 j are obtained based on Eqs. ( 15)-( 18).Other inorganic species of the anthropogenic emission, such as E EC and E ORG , are not perturbed for WRF-Chem.However, other anthropogenic emissions, such as E PM 2.5 , E SO 4 and E NO 3 , are much larger than E EC and E ORG in most areas of China, and the ensemble spreads of the aerosol concentrate largely dependent on the uncertainties of those anthropogenic emissions.Besides, model errors raised from the meteorology, the emission and the chemical model itself are compensated to some extent through the use of multiplicative inflation.In other words, the ensemble spread of the concentrations can be kept at a certain level though E EC and E ORG and are not perturbed.
Natural emissions, such as dust and sea salt emissions, were not perturbed explicitly when the forecast emissions were generated.However, emissions of dust and sea salt were parameterized within the GOCART model (Chin et al., 2002).Within the DA system, varying meteorology across the members implicitly perturbed dust and sea salt emissions.

Forced by the changed emissions (E PM
; the other emissions such as E EC and E ORG remained unchanged), WRF-Chem is run again to forecast the chemical fields ρ f with the updated chemical fields of the previous assimilation cycle as the ICs.The state variables, i.e. 15 aerosol species and four scaling factors, are then prepared.
4. The model-simulated PM 2.5 concentration at the observation space is then calculated via Eq.( 13).At this time, the state vector 5. In the assimilation step, the state variables, the concentrations of 14 defined aerosol species and a 15th unspeciated aerosol, and the four species of emission scaling factors, λ f PM 2.5 , λ f SO 2 , λ f NO and λ f NH 3 , were optimized through EnSRF.
6.After the assimilation step, the optimized emissions NO 3 i and E a NO 3 j ) were calculated according to Eqs. ( 6) and ( 15)-( 18) using the optimized scaling factors (λ a PM 2.5 , λ a SO 2 , λ a NO and λ a NH 3 ).

PM 2.5 observation data and errors
Hourly averaged surface PM 2.5 observations from the Ministry of Environmental Protection of China were assimilated.There were altogether 876 national control measurement sites over China.The PM 2.5 observation sites spanned most of central and eastern China but were primarily located in urban and suburban areas.Thus there was always more than one observation site in a particular city, which fell into the same model grid.Since we did not know the exact observation environment of the sites, we randomly selected one observation site in a city for assimilation experiment and one for verification purposes to ensure that there was at most one assimilated measurements for one model grid.Altogether 77 stations were selected for the PM 2.5 assimilation experiment and another 77 independent stations were selected for verification.Figure 1 shows the locations of 77 measurement sites used for the PM 2.5 assimilation experiment and 77 independent sites used for forecast verification.
The observation error covariance matrix R in Eq. ( 9) includes contributions from measurement and representation errors.Similar to the work of Schwartz et al. (2012), the measurement error ε 0 is defined as ε 0 = 1.5 + 0.0075 • 0 , where 0 denotes the observational values for PM 2.5 (µg m −3 ).Thus, higher PM 2.5 values were associated with larger measurement errors.Following Elbern et al. (2007), Pagowski et al. (2010) and Schwartz et al. (2012), the representativeness error ε r depends on the resolution of the model and the characteristics of the observation locations and is calculated as ε r = rε 0

√
x/L, where r is an adjustable parameter (here, r = 0.5), x is the grid spacing (here, 40.5 km), and L is the radius of influence of an observation (here, L was set to 3 km following Elbern et al. (2007), since we do not know the station type used in the work).The total PM 2.5 error (ε t ) is defined as ε t = ε 2 0 + ε 2 r .The observation errors are assumed to be uncorrelated so that R is a diagonal matrix.
The PM 2.5 observations were subject to quality control to ensure data reliability before DA.Considering that China has had intense pollution events, PM 2.5 values larger than 800 µg m −3 were classified as unrealistic and were not assimilated; observations with the ensemble mean of the first guess departure exceeding 100 µg m −3 were also omitted, following Schwartz et al. (2012).The number of observations was about 17 700.Among them, eight observations were discarded because they were larger than 800 µg m −3 and 243 (around 1.5 %) were discarded due to the latter reasons.

Experimental design
Two parallel experiments were performed to evaluate the impact of PM 2.5 DA on the analyses and forecasts of aerosols over China: an assimilation experiment and a control experiment.Both experiments used identical WRF-Chem settings and physical parameterizations.

Spin-up ensemble forecast with perturbed initial and boundary conditions
The initialization and spin-up procedures were identical to those reported by Schwartz et al. (2014).The ICs and lateral boundary conditions (LBCs) for the meteorological fields were provided by the National Centers for Environmental Prediction Global Forecast System (GFS).
The initial meteorological fields were created at 00:00 UTC on 1 October 2014 by interpolating the GFS analyses onto the model domain.The 50 ensemble members were then generated by adding Gaussian random noise with a zero mean and static background error covariances (Torn et al., 2006) to the temperature, water vapour, velocity, geopotential height and dry surface pressure fields.The ICs of each member were zero in the initial aerosol fields, representing clean conditions as described by Liu et al. (2011).
The LBCs for the meteorological fields were then interpolated from the GFS analyses from 00:00 UTC on 1 October to 00:00 UTC on 16 October 2014 and perturbed similarly to the initial fields at 00:00 UTC on 1 October 2014.The aerosol LBCs of each member for all experiments were idealized profiles embedded within the WRF-Chem model.
Fifty-member emissions were created by adding random noise to the anthropogenic emissions, same as reported by Schwartz et al. (2014): where E * ip (ηt) is the ith ensemble member for the pth emissions variable at the ηth grid point and the tth hour, and E p is the unperturbed emissions.The term σ E p is the standard deviation of all E p values and in the horizontally adjacent points of grid box η at and within 2 h of t.W is a weight that was randomly drawn from a standard Gaussian distribution and varied for each ensemble member and variable but was spatially and temporally constant.No correlations between emissions variables were considered, which was a limitation of this approach.For possible negative perturbed emissions, they were set as E * ip (η, t) = 0.001 • E p (η, t).This will increase the prescribed emissions more or less.However, only very few data were negative.So, this influence can be negligible.
Before the first DA cycle, a 50-member ensemble of 4day WRF-Chem forecasts was performed from 00:00 UTC on 1 October to 23:00 UTC on 4 October 2014 using the perturbed ICs at 00:00 UTC on 1 October 2014, the corresponding perturbed LBCs and the emissions.Then a 50-member ensemble aerosol forecast at 00:00 UTC on 5 October 2014 was produced.

Assimilation experiments
Two DA experiments were performed.One was the pure assimilation of chemical ICs (hereafter expC), while the other was the joint adjustment of chemical ICs and source emissions (hereafter expJ).Both DA experiments had the same settings except for the emissions.They were conducted from 00:00 UTC on 5 October 2014 to 00:00 UTC on 16 October 2014.The assimilation cycle interval was 1 h.In the first DA cycle in expJ, the first 50 ensemble chemical fields were drawn from the WRF-Chem ensemble forecasts valid at 00:00 UTC on 5 October 2014, as described in Sect.4.1.Using the ensemble aerosol forecasts, the prior emission scaling factors λ p i,t at 23:00 UTC on 4 October 2014 were calculated.λ p i,t were used directly as λ f i,t for the first five assimilation cycles (after five assimilation cycles, the system has been initialized, all future scaling factors could be created using the persistence forecasting operator M SF ).Then, the state vector x f = [C f , λ f ] T was prepared.After that, the DA cycle started.
In expC, the first chemical fields were also drawn from the WRF-Chem ensemble forecasts valid at 00:00 UTC on 5 October 2014.Then, the state vector ] T was prepared and the DA cycle started.
During the WRF-Chem forecast step of the subsequent assimilation cycles for both experiments, the ICs for the chemical variables of each member were drawn from the updated chemical fields of the previous cycle.The aerosol LBCs of each member for all experiments were idealized profiles embedded within the WRF-Chem model.As for the meteorological ensemble fields, the LBCs were prepared in advance as depicted in Sect.4.1; the ICs of each member of the meteorological fields were drawn from the forecast meteorological fields of the previous cycle before re-centering with the GFS analysis because we do not do meteorological analysis: where π i is the ith member of the forecast meteorological fields of the previous cycle, π is the ensemble mean of the forecast meteorological fields of the previous cycle, π GFS is the meteorological field interpolated from the GFS analyses and π i new is the new meteorological field used as the IC in WRF-Chem in the next cycle.
As stated in the first paragraph in this section, the settings of expC were the same as those in expJ except for the emissions.In expJ, the ensemble anthropogenic emissions were generated by using emission scaling factors, while in expC the ensemble anthropogenic emissions were prepared by adding random noise, as stated in Sect.4.1.

Control experiment
The control experiment was conducted for the same period as the assimilation experiment and the simulation cycle period was 1 h, as in the assimilation experiment.The first initial chemical fields were extracted from the ensemble mean valid at 00:00 UTC on 5 October 2014.In the subsequent simulation process, the ICs for the chemical fields were from the previous cycle's 1 h forecast.The LBCs and ICs for the meteorological fields were updated by interpolating the GFS analyses.The emissions were the prescribed emissions E p t without any perturbation.

Results
Statistics for both expJ and expC were computed using the ensemble mean prior (background) and posterior (analysis) fields (average of the 50-member ensemble).The ensemble performances were first examined.Output from the first day of the cycling DA configurations was excluded from all verification statistics to allow the ensemble fields to "spin up" from the initial ensemble.
As the measurement coverage is an important factor that may determine the performance in DA, we primarily focused our attention on the results from three sub-regions with comparatively dense observational coverage (Fig. 1): the Beijing-Tianjin-Hebei region (JJJ; 12 stations for assimilation and 12 stations for verification); the Yangtze River delta (YRD; 24 stations for assimilation and 24 stations for verification); and the Pearl River delta (PRD; 9 stations for assimilation and 9 stations for verification).

Ensemble performance
It is important to assess the ensemble performance for an ensemble-based DA system.In a well-calibrated system, a comparison of the prior ensemble mean root-mean-square error (RMSE) with respect to the observations should equal the prior "total spread" (square root of the sum of ensemble variance and observation error variance) (Houtekamer et al., 2005).Figure 3 shows the time series for the prior ensemble mean RMSE and the total spread for PM 2.5 aggregated over all observations in the three sub-regions for expJ.It indicates that the magnitudes of both the total spread and the RMSE were influenced by the diurnal cycle and heavy air pollution.Almost all the total spreads were smaller than the RMSE, showing an insufficient spread of PM 2.5 ensemble forecasts, which is especially evident for heavy polluted period with much larger RMSEs.For expC, the characteristics of the prior ensemble mean RMSE and the total spread for PM 2.5 were very similar to that for the joint DA experiment.
The magnitudes of the ensemble spread of the emission scaling factors of the joint DA experiment were important for emission inversion.They were very stable throughout the ∼ 10-day experiment period, which indicates that M SF can generate stable artificial data to generate the ensemble emissions.For λ f PM 2.5 , they ranged from 0.25 to 1 in most model area.Figure 3d shows the area-averaged time series extracted from the ensemble spread of λ f PM 2.5 .It shows that the ensemble spread was stably distributed around 0.5, which indicates that the uncertainty of the ensemble emissions was about 50 %.

Impact on aerosol ICs
To evaluate quantitatively the impact of the ensemble assimilation system on the ICs, the mean errors (bias), RMSEs and correlation coefficient (CORR) of the assimilation exper-Atmos.Chem.Phys., 17, 4837-4855, 2017 www.atmos-chem-phys.net/17/4837/2017/iment and the control run were first analysed.These statistics were calculated against independent observations over all the analyses from 6 to 16 October 2014.Table 1 shows that the bias magnitudes of the control run were 15.9 and 20.6 µg m −3 for the YRD and the PRD, respectively, suggesting a significant overestimation of the WRF-Chem aerosol mass in these two sub-regions.However, a significant underestimation of the aerosol mass occurred in the JJJ region, where the model bias was −18.0 µg m −3 .The RMSEs of the control run were 81.6, 30.6 and 31.8 µg m −3 for the JJJ, YRD and PRD regions respectively.After assimilation, the statistics showed an apparent improvement and the magnitude of the bias and the RMSE decreased for both DA experiment.For expJ, both the maximum bias and the RMSE were obtained in the JJJ region and were −10.3 and 66.9 µg m −3 respectively.The CORR increased from 0.79, 0.60 and 0.62 to 0.83, 0.85 and 0.80 for the JJJ, YRD and PRD respectively.The statistics of expC were very similar to those of expJ.The bias and the RMSE in the JJJ region were −12.2 and 64.0 µg m −3 respectively.And the CORR were 0.85, 0.80 and 0.80 for the JJJ, YRD and PRD respectively.These results indicate that the initial PM 2.5 fields can be adjusted efficiently by the En-SRF.
It is interesting to note that expC has better RMSE and CORR than expJ but poor bias in JJJ and expC has better bias and RMSE than expJ but poor CORR in PRD.Maybe the small number of samples caused the uncertainties of the statics.However, the differences were very small.The analyses of both experiments were very similar.
Then the analysis increments (i.e.x a − x b ) were investigated to show the direct impact of PM 2.5 DA.They are determined by both the observation increments and the relative magnitudes of the forecast error and the observation error, based on Eq. ( 7).From Fig. 4a, e and f, the increments of both assimilation experiments were distributed around the observations as expected.However, the impact of assimilating PM 2.5 observations was not limited to the areas where observations were located; observation information was also transported to other areas through the WRF-Chem forecast.Besides, the ensemble forecasts also partly contributed to the spatial distribution of the PM 2.5 mass.Therefore, the spatial distributions of the PM 2.5 mass in both assimilation experi-ments were significantly different from the control run (see Fig. 4b-d), which suggests that assimilation PM 2.5 observations impact greatly the aerosol ICs.The PM 2.5 mass magnitudes of both assimilation experiments were smaller than that of the control run at the lowest model level in the YRD, PRD and central China.Conversely, positive differences (analysis minus control) were gained in the JJJ region and in northeast China.These indicated the reduction of the overestimation or underestimation of the WRF-Chem simulation over these regions with data assimilation.

Impact on emissions
To determine the impact of assimilating PM 2.5 observations on the chemical emissions, we analysed the area-averaged time series extracted from the forecast emission scaling factors, the optimized emission scaling factors, the prior emissions and the optimized emissions.Figure 5 shows that λ f PM 2.5 were changed along with λ a PM 2.5 .This indicates that observation information ingested from the previous observations was incorporated through the usage of the time smooth operator.
Figure 5 also shows that although the prior emissions E p PM 2.5 had no diurnal variation when the experiments were designed, the optimized PM 2.5 scaling factor, λ a PM 2.5 , showed an obvious variation with time, as did the optimized unspeciated primary sources of PM 2.5 , E a PM 2.5 .Moreover, the values of λ a PM 2.5 were < 1 at almost all times in the YRD and PRD, which resulted in the analysed emission E a PM 2.5 being lower than the prior PM 2.5 emissions E p PM 2.5 .In the YRD, the prior E p PM 2.5 was about 0.127 µg m −2 s −1 over all hours.After assimilation, the time-averaged optimized E a PM 2.5 decreased to 0.107 µg m −2 s −1 , about 15.6 % lower than the prior value.In the PRD, the prior E p PM 2.5 was about 0.10 µg m −2 s −1 .The time-averaged optimized E a PM 2.5 decreased to 0.066 µg m −2 s −1 , leading to a decrease of 35.0 %.However, larger values for the optimized E a PM 2.5 were obtained in the JJJ region in three periods, from 16:00 UTC on 6 October to 00:00 UTC on 8 October, from 16:00 UTC on 9 October to 00:00 UTC on 10 October and from 16:00 UTC on 13 October to 00:00 UTC on 15 October as a result of the increased optimized scaling factor λ a PM 2.5 .This may have been caused by the burning of crop residues during harvesting in this region (Li et al., 2016), which was not taken into account in the prior emissions.However, the PM 2.5 measurements network was still spatially sparse and heterogeneous in this work.Almost all measurements were located in the city and no data are available in the rural.Meanwhile, the crop residues burning always occur in the rural region.Therefore, the PM 2.5 measurements network can only capture the burning information a few hours later.Hence, although the system is able to detect the emission changes caused by burning events, the time that the system started to show increased scaling factors might not be accurate enough (it may shift a few hours later).A Kalman smoother may have been a better system to solve this problem.
The NO, SO 2 and NH 3 emissions were all adjusted to some extent by our DA approach (see Fig. 6).The NO emissions increased by 41.3, 43.7 and 20.3 % in the JJJ, YRD and PRD regions respectively.The SO 2 emissions increased by 16.3, 10.0 and 18.3 % and the NH 3 emissions increased by 16.7, 7.8 and 7.5 % in the JJJ, YRD and PRD regions respectively.
Figure 7 shows the spatial distribution of the timeaveraged scaling factors λ a PM 2.5 at the lowest model level over all hours from 6 to 16 October 2014, since the emissions at higher levels were so small that the impact of assimilating PM 2.5 observations was negligible.Figure 8 shows the distribution of E p PM 2.5 and the time-averaged differences between the ensemble mean of the assimilation and the prior values.
These patterns are consistent with those in Fig. 5. Negative differences were obtained in most areas of the YRD and PRD, indicating that the PM 2.5 DA primarily decreased  the PM 2.5 emissions.Conversely, positive differences were obtained in South Hebei, North Henan and Southeast Shanxi provinces, indicating that DA increased the PM 2.5 emissions.
As the economy in China has developed, the spatiotemporal distribution of emissions has changed as a result of changes in energy consumption, the structure of the energy market and advances in technology.Therefore although this inventory of emissions may have correctly described anthropogenic emissions in 2006 when it was constructed, it is not representative of the anthropogenic emissions in 2014.Theoretically, the assimilated emissions should reduce the uncertainty in the prior emissions as a result of the application of observations.Different from the reports of standard national emission inventories by governments in USA, Europe and other countries, the rapid economic development and complexity of emission sources in China have led to large uncertainties in the current emission inventories, even for the latest version.Thus it is impossible for us to conduct a direct evaluation on emissions.
Although we had no direct emission observations to evaluate the analysing emissions, which was a challenging to many emission inversion research teams (e.g.Tang et al., 2011;Miyazaki et al., 2012;Ding et al., 2015;McLinden et al., 2016), the improvement of emissions can be verified in terms of two aspects: the diurnal variation and the location of increased emissions.The diurnal variation in the assimilated emissions verified this statement to some extent.Especially in the PRD and YRD, E a PM 2.5 in the daytime were always larger than those in the night, which agreed well with Olivier et al. (2003), the WRAP (2006) and Wang et al. (2010).In addition, the locations of the larger values for the optimized E a PM 2.5 in the JJJ region were in good agreement with the locations of crop residue burning traced by the environmental satellite of China.There were 10, 231, 37 and 3 crop residue burning spots from 5 to 11 October 2014 and 7, 20, 5 and 21 from 12 to 18 October 2014 in Hebei, Henan, Shandong and Shanxi provinces, respectively (Weekly Crop Residue Burning Monitoring Report traced by Environmental Satellite, 2015a, b).
However, the analysis emissions are only a mathematical optimum.They are influenced greatly by the model errors and the observation errors.In addition, only surface PM 2.5 observations were applied in this work, which may lack abundant constraint on the sources of the secondary aerosol precursors.More observations are needed to obtain reliable emissions for the sources of the gas-phase precursors.

Verification of aerosol forecasting
For the assimilation experiment, 48 h forecasts were performed at each 00:00 UTC from 6 to 16 October 2014 with the hourly forecast output for both assimilation experiments.For the verification forecasting experiment for expJ (hereafter fcJ), the ensemble mean of the analysed ICs and emissions of expJ were used in this longer-range model forecast.For the verification forecasting experiment for expC (hereafter fcC), the ensemble mean of the analysed ICs of expC and the prescribed anthropogenic emissions were used.
In order to get a more visualized picture of the impact of DA for both assimilation experiments, time series of the hourly PM 2.5 extracted from the analysis (AN), the control run (CT) and the hourly output of 48 h forecast (fc24 for the first-day forecast and fc48 for the second-day forecast) were compared with the observations (OBS) for the megacities Beijing, Shanghai and Guangzhou respectively (Fig. 9).As expected, the time series of the analysis (also the background) were consistent with the observations.The control run showed large deviations from the observations, especially in Shanghai and Guangzhou.Benefit from DA on both the first-day and the second-day forecasts can be clearly seen.
The bias and the RMSE of the surface PM 2.5 forecasts as a function of forecast range were then calculated against the independent observations for the three sub-regions (Fig. 10).Both the bias and the RMSEs of the control run were characterized by the diurnal cycle in the YRD and PRD.The largest errors were seen at 21:00 UTC in the YRD (about 29 µg m −3 for bias and 37 µg m −3 for RMSEs) and at 23:00 UTC in the PRD (about 36 µg m −3 for bias and 41 µg m −3 for RM-SEs), likely indicating significant systematic forecast errors at these times.From 03:00 to 09:00 UTC, the bias (about 1 µg m −3 in the YRD and −5 µg m −3 in the PRD) and the RMSE values (about 14 µg m −3 in the YRD and 16 µg m −3 in the PRD) were much smaller than at other times in both the YRD and PRD, showing that WRF-Chem performed well during this period.However, in the JJJ region, the bias (about −20 µg m −3 ) and the RMSEs (about 50 µg m −3 ) were always large as a result of a heavy pollution event.After assimilation, both the magnitude of the bias and the RMSEs decreased sharply.Especially in YRD and PRD, most bias ranged from −5 to 5 µg m −3 and most RMSEs ranged from 11 to 14 µg m −3 , further indicating that DA greatly affected the ICs.
The improvements in the surface PM 2.5 forecasts by the joint adjustment of the ICs and emissions were very large in the YRD and PRD for expJ.Large reduction of the magnitude of the bias and the RMSEs due to assimilation can be seen for almost the entire 48 h forecast range.From 10 to 23 h and from 34 to 47 h, in particular, the relative reduction in RMSE was about 37.5 %.However, the DA impact was much smaller for 3 to 9 h forecast ranges, which are daytime of the first-day forecast.In addition, the improvements were nearly negligible in PRD from 27 to 33 h, the daytime of the second-day forecast, suggesting that the benefit gained from adjusting the ICs decreased progressively and eventually disappeared with model integration.The performance actually deteriorated in YRD during the same time.One of the possible reasons was that chemical model performed sufficiently well during daytime when the boundary layer was unstable and therefore the further improvement was more difficult.There were also always large errors during the night when the boundary layer was stable, so that large improvements could be obtained.The other possible reason can be attributed to the a priori constant emissions.The differences between the optimized PM 2.5 emissions and the prior emissions were comparatively small during the day, but the optimized PM 2.5 emissions were much smaller than the a priori emissions during the night; thus, the control run performed worse during the night and well during the day.Given the a priori variable emissions provided, the control run will per- form better during the night.Nevertheless, attributed greatly to the large adjustment of chemical emissions, substantial improvements were still achieved from 34 to 47 h.These results revealed that joint adjustment of the ICs and emissions can improve surface PM 2.5 forecasts up to 48 h in the YRD and PRD.
As for expC, it seemed that large improvements in the surface PM 2.5 forecasts were gained through the adjustment of the ICs in PRD from 10 to 23 h and from 34 to 47 h.Large reduction of the magnitude of the bias and the RMSEs due to assimilation can be seen during this period.The relative reduction in RMSE ranged from 25 to 37.5 %.However, the forecasts deviated significantly from the observations for 3 to 9 h and 27 to 33 h forecast ranges.One reason for this may be that the adjustment of the ICs decreased the analysis field too much on the whole since the WRF-Chem forecast aerosol mass was systematically overestimated in PRD (see Figs. 4,9f and 10e).This aerosol mass overestimation might be also due to the possibly overestimated emissions in some time periods (not all day long) that are not corrected in the simulation.So the over-adjusted ICs compensated for the unadjusted emissions in some periods but also led to the negative biases for the periods when emissions were not overestimated or underestimated.The other factor was the diurnal variation.It is very clear that PM 2.5 mass gradually decreased with time from 00:00 to 00:08 UTC and then obtained the smallest value.After that it increased with time from 00:09 to 00:23 UTC and obtained the largest value at about 00:00 UTC.Both reasons led to the systematically underestimation of PM 2.5 mass of fcC from 3 to 9 h and from 27 to 33 h, though the aerosol ICs may have been very close to the observations.Therefore, both the magnitude of the bias and the RMSEs of the fcC were larger than those of the control run.In addition, PM 2.5 forecasts of the fcC benefited much from the diurnal variation and the adjustment of the ICs from 10 to 23 h and from 34 to 47 h.As a consequence, the magnitude of the corresponding bias and the RMSEs of the fcC were smaller than those of the control run.Similar statistical characteristics were also gained in YRD.However, the improvements were comparatively small from 10 to 23 h and from 34 to 47 h.However, the performance of fcJ was much better than that of fcC during the night in PRD and YRD, while in the daytime the improvement of expJ seems to be not so big or even negligible.This could be attributed to the emissions since the ICs of both forecasts were very similar.In the forecast experiment of expC, the emissions were the default monthly anthropogenic emissions, while in the forecast experiment of expJ the assimilated emissions were much smaller than the default anthropogenic emissions almost every night in both regions, indicating that the prior emission uncertainties might be the dominating reasons for biases between observed and model-simulated concentrations in these cases.In the daytime in PRD, the assimilated emissions were a little smaller than the default anthropogenic emissions.In the daytime in YRD, the assimilated emissions were a little larger than the default anthropogenic emissions for most time (see Fig. 5).Those changes between assimilated emissions and prior emissions in the daytime are not as significant as in the nighttime.
Both DA systems did not perform as well in the JJJ region as in the YRD and PRD.Relatively smaller improvements were achieved in the first 24 h forecast but then no improvements were achieved afterwards in JJJ.One possible reason for this result may be systematic errors due to chemistry mechanism in WRF-Chem.The sources of the aerosols are so complex that our knowledge of their formation mechanisms is far from clear and large uncertainties still exist in the model simulations.Chemical transport models have a tendency to underestimate PM concentrations, especially during episodes of heavy pollution (Denby et al., 2007) due to some missing reactions (Wang et al., 2014;Zhang et al., 2015;Zheng et al., 2015;Chen et al., 2016).Another reason can be attributed to the forecast meteorological fields.There were still large uncertainties, especially when the boundary layer was stable and the wind speed was very small during episodes of heavy pollution.As a result, a large bias may be obtained in forecasts of heavy pollution given the ICs and emission inventories achieved from the joint assimilation.Another reason may be the sparse coverage of measurements.There were only 12 sites in the JJJ region (Fig. 1) and the measurement coverage was much sparser than in the YRD or PRD.

Summary and discussion
The EnSRF algorithm was extended to adjust the chemical ICs and the primary and precursor emissions to improve forecasts for surface PM 2.5 .This system was applied to assimilate hourly surface PM 2.5 measurements from 5 to 16 October 2014 over China.To evaluate the effectiveness of DA, 48 h forecasts were performed using the optimized ICs and emissions, together with a control experiment with-out DA.The experiment of pure assimilation chemical ICs and the corresponding 48 h forecasts experiment were also performed for comparison.The results indicated that the forecasts with the optimized ICs and emissions performed much better than the control simulations.Large improvements were achieved for almost all the 48 h forecasts, particularly in the YRD and PRD.Although it did show some improvements in the first 24 h, there is no difference between the control run and the forecasts in the JJJ region afterwards, which may be attributed to the sparse measurement coverage and the deficiencies in the model system for forecasting heavy pollution.Compared to the forecasts with only the optimized ICs, the forecasts with the joint adjustment were always much better during the night in the PRD and YRD regions.However, they were very similar during daytime in both regions.In the JJJ region, they performed similarly for almost the entire 48 h forecast range.
There are still some limitations in this study.Firstly, we use the default monthly anthropogenic emissions as the prior emissions and no time variation was added to keep objective, since no resolution of temporal allocations at shorter but critical (e.g.day-of-week, diurnal) scales is available.As shown in earlier work, the constant emissions will worsen the chemical forecasts (de Meij et al., 2006;Wang et al., 2010).For the joint DA system itself, it cannot benefit from the constant prior anthropogenic emissions.The normalized RMSE in Fig. 10g decreased due to the poor forecasts of control run.The control run will perform better when variable emissions within the day are allowed, especially during the night.As a result, the relative reduction in RMSE could not be so large during the night.Secondly, no correlations between emissions variables were considered when perturbing the emissions, which led to the reduction of the correlations between the variables.Thus, the chemical forecast will deviate from the truth to some degree.Fortunately, the perturbed emissions were only used in the initialization and spin-up experiment and expC.Therefore, there were no impacts on expJ and the control run except for expC.Thirdly, E EC and E ORG are not perturbed in expJ.However, as stated in Sect.2.3.2, the ensemble spread of OC 1 and OC 2 can be kept at a certain level.As a result, OC 1 and OC 2 contributed to the PM 2.5 assimilation in expJ, which suggests that the influence of not perturbing E EC and E ORG could be negligible.However, because of the too-small magnitudes of BC 1 and BC 2 , the differences (assimilation minus control) of BC 1 and BC 2 were close to zero.Fourthly, the experiment (expE) where only emissions were assimilated was not included here.However, it was still worth simultaneously assimilating the chemical ICs and emission.For one thing, in expE the chemical concentrations can be updated by the WRF-Chem model simulations with the assimilated emissions as the initial field in each DA cycle.That means that the 50-member ensemble forecasts were performed twice, which was time consuming.Additionally, better concentration analysis could be obtained in expJ due to the simultaneous assimilation of ICs and emis-sions, while in expE there may be larger uncertainties for the updated chemical concentrations through WRF-Chem due to the deficiency of chemistries and the uncertainties of the ICs.This will lead to larger uncertainties for the emission inversion.Also the improvement of PM 2.5 forecasts will be limited due to the comparatively poor chemical ICs.
This study represents the first step in the simultaneous optimization of chemical ICs and emissions and only surface PM 2.5 measurements were assimilated.In future work, gasphase observations of SO 2 , NO 2 and CO will be used to further improve the performance of this DA system.

Figure 1 .
Figure1.Locations of 77 PM 2.5 assimilation observation stations (black dot) and the 77 independent observation stations (red triangle) in the model domain.The three coloured boxes mark subregions with relatively dense coverage for the Beijing-Tianjin-Hebei region (JJJ; 12 assimilation stations and 12 independent stations; red box), the Yangtze River delta (YRD; 24 assimilation stations and 24 independent stations; blue box) and the Pearl River delta (PRD; 9 assimilation stations and 9 independent stations; green box).

Figure 2 .
Figure 2. (a) Framework of M SF and (b) flow chart of the data assimilation system that simultaneously optimizes the chemical initial conditions and emissions.

Figure
Figure2bshows the workflow of the DA system.The steps in this workflow are as follows.1.The persistence forecasting operator M SF is applied to forecast the background fields of the emission scaling factors λ f PM 2.5 , λ f SO 2 , λ f NO and λ f NH 3 .The forecast chemical fields of P 25 , SO 2 , NO and NH 3 of the previous assimilation cycle are used to create the prior emission scaling factors λ p PM 2.5 , λ p SO 2 , λ p NO and λ p NH 3 .The background scaling factors are then generated using Eq.(5).

Figure 3 .
Figure 3.Time series of prior ensemble mean RMSE and total spread for PM 2.5 concentrations aggregated over all observations over the three sub-regions: (a) Beijing-Tianjin-Hebei region, (b) Yangtze River delta and (c) Pearl River delta.(d) Time series of the area mean ensemble spread for λ PM 2.5 over the three sub-regions.

Figure 4 .
Figure 4. Spatial distribution of the PM 2.5 mass (µg m −3 ) of the (a) observations, (b) simulation of the control run, (c) analysis of expJ, (d) analysis of expC, (e) increments of expJ and (f) increments of expC at the lowest model level averaged over all hours from 6 to 16 October 2014.

Figure 5 .
Figure 5. Hourly area-averaged time series of emission scaling factors (black) extracted from the ensemble mean of the analysed λ a PM 2.5

Figure 6 .
Figure 6.Hourly area-averaged time series of emission scaling factors extracted from the ensemble mean of the analysed (a) λ a NO , (b) λ a SO 2

Figure 7 .
Figure 7. Spatial distribution of λ PM 2.5 at the lowest model level averaged over all hours from 6 to 16 October 2014.

Figure 8 .
Figure 8. Spatial distribution of (a) the prior unspeciated primary sources of PM 2.5 (µg m −2 s −1 ) and (b) the time-averaged differences between the ensemble mean analysis and the prior values (µg m −2 s −1 ) at the lowest model level averaged over all hours from 6 to 16 October 2014.

Figure 9 .
Figure 9.Time series of the hourly PM 2.5 obtained from observations (circle), analysis (blue line), control run (black line) and hourly output of 48 h forecast in three megacities: (a) Beijing, (c) Shanghai and (e) Guangzhou in expJ and (b) Beijing, (d) Shanghai and (f) Guangzhou in expC.See text in Sect.5.4.

Figure 10 .
Figure 10.Bias of surface PM 2.5 as a function of forecast range calculated against all the independent observations over the three sub-regions shown in Fig. 1: (a) Beijing-Tianjin-Hebei region, (c) Yangtze River delta, (e) Pearl River delta and RMSE over (b) Beijing-Tianjin-Hebei region, (d) Yangtze River delta and (f) Pearl River delta.(g) Normalized RMSE (assimilation divided by control) for expJ and (h) normalized RMSE for expC.The 48 h forecasts were performed at each 00:00 UTC from 6 to 16 October 2014 and the statistics were computed from 6 to 16 October.

Table 1 .
Comparison of the surface PM 2.5 mass concentrations from the control and assimilation experiments to observations over all analysis times from 6 to 16 October 2014.