Integrated emission inventory and modeling to assess distribution of particulate matter mass and black carbon composition in Southeast Asia

This is part of a research study addressing the potential co-benefits associated with selected black carbon (BC) emission reduction measures on mitigation of air pollution and climate forcing in Southeast Asia (SEA). This paper presents details of emission inventory (EI) results and WRF–CHIMERE model performance evaluation. The SEA regional emissions for 2007 were updated with our EI results for Indonesia, Thailand, and Cambodia and used for the model input. WRF–CHIMERE was used to simulate 2007 PM10, PM2.5, and BC over the SEA domain (0.25° × 0.25°), and the results were evaluated against the available meteorology and air quality monitoring data in the domain. WRF hourly simulation results were evaluated using the observed data at eight international airport stations in five SEA countries and showed a satisfactory performance. WRF–CHIMERE results for PM10 and PM2.5 showed a strong seasonal influence of biomass open burning, while the BC distribution showed the influence of urban activities in big SEA cities. Daily average PM10 concentrations, constructed from the hourly concentrations obtained from the automatic monitoring stations in three large SEA cities (Bangkok, Kuala Lumpur, and Surabaya), were used for model evaluation. The daily observed PM2.5 and BC concentrations obtained from the Improving Air Quality in Asian Developing Countries (AIRPET) project for four cities (i.e., Bangkok, Hanoi, Bandung, and Manila) were also used for model evaluation. In addition, hourly BC concentrations were taken from the measurement results of the Asian Pacific Network (APN) project at a suburban site in Bangkok. The modeled PM10 and BC satisfactorily met all suggested statistical criteria for PM evaluation. The modeled PM2.5/PM10 ratios estimated for four AIRPET sites ranged between 0.47 and 0.59, lower than the observed values of 0.6–0.83. Better agreement was found for BC/PM2.5 ratios, with modeled values of 0.05–0.33 as compared to observed values of 0.05–0.28.
AODEM (extended aerosol optical depth module) was used to calculate the total columnar aerosol optical depth (AOD) and BC AOD up to the top of the domain at 500 hPa (∼5500 m), which did not include the free-tropospheric long-range transport of the pollution. The model AOD results calculated using the internal mixing assumption were evaluated against the observed AOD from both AERONET and the MODIS satellite in 10 countries in the domain. Our model results showed that the BC AOD contributed 7.5–12 % of the total AOD, which was in the same range reported by other studies for places with intensive emissions. The results of this paper are used to calculate the regional aerosol direct radiative forcing under different emission reduction scenarios to explore potential co-benefits for air quality improvement, reduction in the number of premature deaths, and climate forcing mitigation in SEA in 2030 (Permadi et al., 2017a).


Introduction
Southeast Asia (SEA), with a large population and fast-growing economy, is an important contributor to the emissions of air pollution and greenhouse gases in Asia (Streets et al., 2003; Zhang et al., 2009). The emissions of anthropogenic aerosol from Asia, and specifically from SEA, are expected to rise in the near future due to the increase in the energy demand and rapid industrialization (Lawrence and Lelieveld, 2010; Ohara et al., 2007). High levels of fine particulate matter (PM with diameter less than 2.5 µm or PM2.5), the most detrimental air pollutant affecting health (Janssen et al., 2011; WHO, 2012), are observed in many developing Asian cities, with the annual average often exceeding the WHO guideline of 10 µg m⁻³ by many times (Kim Oanh et al., 2006; Hopke et al., 2008). PM fractions and components, e.g., PM2.5, PM10 (PM with diameter less than 10 µm), black carbon (BC), and organic carbon (OC), have been monitored in some Asian cities and the results, although fragmented, showed considerably high levels (Kondo et al., 2009; Kim Oanh et al., 2006; Hopke et al., 2008). The fine particles and their precursors are also involved in long-range transport (LRT), hence causing regional phenomena such as atmospheric brown clouds (ABC) (UNEP and C4, 2002; Ramanathan et al., 2001) and affecting the climate (UNEP-WMO, 2011). Globally, measures aiming to reduce emissions of BC (and co-emitted pollutants) were shown to reduce the number of premature deaths and slow down the near-future temperature increase, in addition to other benefits to be gained in Asia, where current emissions are high (UNEP-WMO, 2011; Shindell et al., 2012).
Published by Copernicus Publications on behalf of the European Geosciences Union.
To comprehensively assess the co-benefits of emission reduction measures on a regional scale, finer temporal and spatial resolutions of the modeling results are required. Several studies have been conducted for various Asian domains using a regional climate model with chemistry (Nair et al., 2012) or chemical transport models (CTMs) with an additional aerosol optical module. Most Asian regional modeling studies focused on the domains of East Asia (Han et al., 2011; Park et al., 2011; Chen et al., 2013; Zhang et al., 2016), South Asia (Goto et al., 2011), and continental East and Southeast Asia (Lin et al., 2014). These studies also highlighted several challenges for models in reproducing the ground-observed PM, arising from inaccurate emission inventories (EI), errors in the simulated meteorological fields, and the extent of model representations (e.g., secondary organic aerosol formation, gas-particle partitioning, dry and wet deposition).
There are currently no detailed modeling studies conducted for the SEA domain, especially maritime SEA, which includes Indonesia with its large biomass open burning (OB) emissions. For such a modeling effort, a reasonably accurate regional EI database should first be prepared to generate input data. Several global and regional EI databases are available which also cover the SEA domain. These datasets have been developed using the activity data taken from several international data sources (Zhang et al., 2009; EC-JRC/PBL, 2010) or based on the results of large-scale energy models (Streets et al., 2003). Efforts should therefore be made to update the SEA EI databases in order to generate emission input data for SEA regional modeling studies.
Our research used integrated EI and modeling tools to provide the spatial and seasonal distributions of aerosol species (PM10, PM2.5, and BC) in SEA for 2007 and the co-benefits (for air quality, health, and climate forcing) of selected emission reduction measures for 2030. This paper presents the SEA emissions for the base year of 2007 and the WRF-CHIMERE performance evaluation. CHIMERE (Menut et al., 2013, and references therein) was used to simulate three-dimensional (3-D) aerosol concentrations in the domain using the meteorological fields generated by the Weather Research and Forecasting (WRF) model (Michalakes et al., 2004). The model results were evaluated using available ground-based measurements of PM10, PM2.5, and BC in several SEA cities. The extended aerosol optical depth (AOD) module (AODEM), detailed in Landi and Curci (2011), was applied to calculate the total columnar AOD and BC AOD. The modeled total AOD was evaluated using the observed AOD from both the ground-based Aerosol Robotic Network (AERONET) and the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite product. The results are used in the follow-up study, which investigated the potential co-benefits of various emission reduction measures implemented in Indonesia and Thailand for air quality improvement, reduction of premature deaths, and climate forcing mitigation in 2030 (Permadi et al., 2017a).

Emission inventory and emission input data
The emissions from major anthropogenic sources in Indonesia, Thailand, and Cambodia were developed following the EI framework given in the Atmospheric Brown Cloud Emission Inventory Manual (ABC EIM) (Shrestha et al., 2013) using the activity data summarized in Table 1. A detailed EI methodology for Indonesia was presented in Permadi et al. (2017b).
The biomass OB categories considered in this study included crop residue open burning (CROB) and forest fires (aboveground forest fires and peatland fires). The CROB emissions (aerosol and trace gases) for Thailand for 2007 were taken from Kanabkaew and Kim Oanh (2011), and both CROB and aboveground forest fire emissions for Indonesia were from Permadi and Kim Oanh (2013), also for 2007. For Cambodia, CROB emissions for 2007 were also included in the inventory (Permadi, 2013), but forest fire emissions were taken from the Global Fire Emissions Database version 3 (GFEDv3) (van der Werf et al., 2010). CROB emissions for Thailand, Indonesia, and Cambodia were estimated from crop production statistics, residue-to-crop ratio, dry-matter-to-crop-residue ratio, fraction of biomass burned in the field, combustion efficiency, and emission factors. The emission results were higher than those of other databases that used the MODIS product (e.g., GFEDv3) because the satellite may not efficiently capture short-lived, small-sized, sporadic CROB fires (Permadi and Kim Oanh, 2013). For other countries in the domain, the emissions from the aboveground forest fires were from Song et al. (2009), while those from CROB were from GFEDv3 for the base year of 2007. The 2007 emissions from peatland fires of all countries in the SEA domain were also taken from GFEDv3 (van der Werf et al., 2010). The GFEDv3 database was developed using a combination of MODIS burned area and active fire products, which is believed to detect peatland fires better than the MODIS burned area product MCD45A1 used alone for forest fire detection (Shi et al., 2014).
Atmos. Chem. Phys., 18, 2725–2747, 2018, www.atmos-chem-phys.net/18/2725/2018/
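The CROB estimate described above is a product of crop production and a chain of factors. A minimal Python sketch of that product is given here for one crop and one species. The function name, argument names, units, and the numbers in the usage line are illustrative assumptions and are not taken from the inventory itself.

```python
def crob_emission_gg(crop_production_t, residue_to_crop, dry_matter_fraction,
                     fraction_burned, combustion_efficiency, ef_g_per_kg):
    """Emission (Gg) of one species from open burning of one crop's residue."""
    # Dry residue actually combusted in the field, in kg (1 t = 1000 kg).
    burned_kg = (crop_production_t * 1000.0 * residue_to_crop
                 * dry_matter_fraction * fraction_burned
                 * combustion_efficiency)
    # Emission factor is assumed to be in g per kg of dry matter; 1 Gg = 1e9 g.
    return burned_kg * ef_g_per_kg / 1e9


# Hypothetical inputs: 1 Mt of crop, ratio 1.0, 80 % dry matter,
# 50 % burned in the field, 90 % combustion efficiency, EF of 10 g/kg.
example_gg = crob_emission_gg(1_000_000, 1.0, 0.8, 0.5, 0.9, 10.0)
```

Summing this quantity over crops and administrative units would give a national total per species; monthly allocation would need the burning calendar, which is not reproduced here.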
Biogenic emissions were calculated online by the CHIMERE model using the methodology described in Simpson et al. (1999), which considers seasons and vegetation cover types taken from the Global Land Cover Facility (GLCF) (http://glcf.umd.edu) with a resolution of 1 km × 1 km. CHIMERE incorporates the Model of Emissions of Gases and Aerosol from Nature (MEGAN) module (Guenther et al., 1995) for estimation of volatile organic compounds (VOCs) from natural vegetation and NO emissions from soil. We did not include the emissions from international shipping in this simulation, but "inland waterway" sources were included in the inventories for the three countries. Other sources of PM such as unpaved road and windblown dust emissions were also not included in this study. Spatial distributions were made based on source activity data or relevant proxies collected for the administrative boundaries. For example, for Indonesia, the emissions were disaggregated at the district level (300 districts), while for Thailand and Cambodia emissions were presented at the provincial level (76 and 24 provinces, respectively). Further, the emissions were gridded to 0.25° × 0.25° using a Geographic Information System (GIS) tool. For other countries in SEA, the emissions of SO2, NOx, CO, VOC, PM10, PM2.5, BC, and OC were taken from the available online gridded EI databases (grid size of 0.5°, i.e., ∼50 km) compiled by the Center for Global and Regional Environmental Research (CGRER) (Zhang et al., 2009). The gridded CH4 and NH3 emissions that were not included in CGRER were taken from the global Emission Database for Global Atmospheric Research (EDGAR) (EC-JRC/PBL, 2010), with a grid resolution of 10 km × 10 km. Emissions from all considered sources were compiled for the base year of 2007 and were gridded to 0.25° × 0.25° (∼30 km × 30 km) for the modeling input.
Monthly activity data were obtained whenever available to construct monthly emissions; when the data were not available, relevant proxies were used. Country-specific hourly variations of emissions were extracted from the available studies in SEA. For Indonesia, the methodology and data sources used to construct monthly and hourly emissions of major anthropogenic sources are detailed in Permadi et al. (2017b). For aboveground forest vegetation, crop residue, and municipal solid waste OB emissions, the monthly and hourly profiles were adopted from Permadi and Kim Oanh (2013). For Thailand, the hourly profiles for power plants and industry were obtained from Pham et al. (2008), while those for other major anthropogenic sources were taken from Vongmahadlek et al. (2009). The survey-based information on hourly profiles reported in Kanabkaew and Kim Oanh (2011) was used for CROB emissions. For other countries, relevant data from the regional EI of Streets et al. (2003) were utilized. The hourly emission input for the domain was prepared using a Fortran program developed at the Asian Institute of Technology (AIT) for this purpose.
The lumping of non-methane volatile organic compound (NMVOC) emissions into the model species was done according to the MELCHIOR mechanism (Middleton et al., 1990). The aggregation produced the emissions of 33 species, including trace gases and aerosol, in units of mol cm⁻² s⁻¹. Aerosol fluxes were also converted to the "molecule-like" units in the emission input data using a fictive molar mass equal to 100 g mol⁻¹ (Bessagnet et al., 2004).
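The conversion of an aerosol mass emission to the "molecule-like" flux mentioned above amounts to dividing by area, time, and the fictive molar mass. The following Python sketch shows that arithmetic. The function name, the choice of input units, and the example values are assumptions made for illustration and do not describe the actual Fortran preprocessing program.

```python
FICTIVE_MOLAR_MASS_G_PER_MOL = 100.0  # value quoted in the text


def aerosol_flux_mol_cm2_s(emission_g, area_cm2, duration_s,
                           molar_mass=FICTIVE_MOLAR_MASS_G_PER_MOL):
    """Convert an emitted aerosol mass (g) into a flux in mol cm^-2 s^-1."""
    # Mass flux in g cm^-2 s^-1, then divide by the molar mass (g mol^-1).
    mass_flux = emission_g / (area_cm2 * duration_s)
    return mass_flux / molar_mass


# Hypothetical example: 1e6 g emitted over 1e4 cm^2 during 10 s.
example_flux = aerosol_flux_mol_cm2_s(1.0e6, 1.0e4, 10.0)
```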

Modeling domain
The choice of domain size and resolution affects the balance between the boundary and internal model forcing in the simulated concentrations (Seth and Giorgi, 1998). For this study, it is important that the defined domain allows the transport of air pollutants by the monsoon circulation across SEA. Therefore, we set the domain to cover as much as possible the major upwind emission sources and to capture meteorological processes in the region of interest.
The SEA domain horizontally covered nine countries of the Association of Southeast Asian Nations (ASEAN) and three provinces of southern China (Fig. S1 in the Supplement). The WRF domain extended from central Myanmar to northern Australia, covering 230 × 200 grids. The CHIMERE domain extended from southern China (24° N, 95° E) to eastern parts of Indonesia (9° S, 137° E), consisting of 169 × 133 grids. The grid resolution of WRF and CHIMERE was set to be the same, 0.25° × 0.25° (∼30 km × 30 km).
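As a quick consistency check, the CHIMERE grid dimensions can be recovered from the quoted corner coordinates and the 0.25° spacing. The sketch below assumes that the corners are grid-point coordinates (rather than cell edges), which is an assumption and not something stated in the text. The function name is hypothetical.

```python
def n_grid_points(start_deg, end_deg, resolution_deg):
    """Number of regularly spaced points from start to end, both inclusive."""
    return round(abs(end_deg - start_deg) / resolution_deg) + 1


# Longitude 95° E to 137° E and latitude 24° N to 9° S at 0.25° spacing.
n_lon = n_grid_points(95.0, 137.0, 0.25)
n_lat = n_grid_points(24.0, -9.0, 0.25)
```

Under that assumption the result is 169 points in longitude and 133 in latitude, which agrees with the 169 × 133 grids given for the CHIMERE domain.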

WRF and CHIMERE model configuration
WRF version 3.3 was used with lateral boundaries and initial meteorological conditions taken from the National Centers for Environmental Prediction (NCEP) final (FNL) global analyses that are available at 1° × 1° grid resolution every 6 h (http://rda.ucar.edu/datasets/ds083.2/). The geographical input data for the WRF Preprocessing System (WPS) (i.e., land use, vegetation index, soil type, and albedo) were also obtained from the NCEP database. In total, 28 vertical levels were simulated, with the lowest level having a physical height of about 38 m. Analysis nudging was performed in the planetary boundary layer (PBL) and other layers for wind components (u and v), temperature (T), and relative humidity (RH). Nudging coefficients were set for all parameters at 0.00005 s⁻¹. The time interval between analyses was set at 360 min, which is equivalent to the 6-hourly boundary input data used in our study. Analysis nudging was performed because it is suitable for coarse-resolution simulations (30 km × 30 km) that drive regional air quality models, since it can improve the accuracy of the downscaled and/or nested fields (Dudhia, 2012; Bowden et al., 2012). Note that, due to the insufficiency of spatially distributed meteorological observations in the domain, observation nudging was not performed.
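To give a feel for the strength of the nudging coefficient quoted above, it can be read as the inverse of a relaxation timescale, assuming the usual Newtonian-relaxation form in which the model tendency gains a term G × (analysis − model). The short Python sketch below computes that timescale. The function name is hypothetical, and the interpretation is a general property of such a term rather than a statement taken from the paper.

```python
def nudging_timescale_hours(coefficient_per_s):
    """E-folding relaxation time (hours) implied by a nudging coefficient."""
    return 1.0 / coefficient_per_s / 3600.0


# Coefficient used for u, v, T, and RH in this study.
timescale_h = nudging_timescale_hours(0.00005)
```

The result is about 5.6 h, which is comparable to the 6 h interval between the analyses.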
In the WRF simulation, the following physics options were used: simple ice microphysics, the unified Noah Land Surface Model for the land-surface scheme, the Rapid Radiative Transfer Model (RRTM) and Dudhia schemes for long- and shortwave radiation, a PBL parameterization scheme from Yonsei University, and a Kain-Fritsch scheme with deep and shallow convection option for cumulus parameterization. These schemes were selected as they are suitable for mesoscale grid size and have been used in previous studies (Jankov et al., 2005; Osuri et al., 2012).
This study used CHIMERE version 2008c with the MELCHIOR 2 chemical mechanism that was adapted from the original European Monitoring and Evaluation Programme (EMEP) and consisted of around 120 reactions and 40 chemical species. The vertical profiles of updated reaction rates in MELCHIOR 2 have been developed using tabulated clear-sky photolysis rates taken from the Tropospheric Ultraviolet and Visible (TUV) model (EC4MACS, 2012). This version of CHIMERE has an aerosol module which consists of the total primary PM emissions (BC, OC, and other primary particles) and secondary PM species, such as nitrate, sulfate, ammonium, and secondary organic aerosol (Bessagnet et al., 2004). CHIMERE applies the sectional approach to discretize the particle size distribution into a finite number of bins. The considered particle size range was from 40 nm to 10 µm, which was distributed into eight bins delimited by the diameters 0.039, 0.078, 0.156, 0.312, 0.625, 1.25, 2.5, 5, and 10 µm (Pere et al., 2011). Most aerosol-related dynamic processes, such as condensation, coagulation, wet and dry deposition, adsorption, and scavenging, are incorporated in the model (http://www.lmd.polytechnique.fr/chimere/). This version of CHIMERE only allows tropospheric simulations below 200 hPa (∼12 km).
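The nine diameters listed for the eight size bins follow a geometric sequence in which each edge is half of the next larger one, starting from 10 µm. The Python sketch below regenerates the edges on that assumption and picks out the bins lying entirely at or below 2.5 µm. The function and variable names are hypothetical, and the use of those bins to represent PM2.5 is an illustrative assumption, not a description of the CHIMERE code.

```python
def sectional_bin_edges(d_max_um=10.0, n_bins=8):
    """Return n_bins + 1 diameter edges (µm), each half of the next larger."""
    return [d_max_um / 2 ** k for k in range(n_bins, -1, -1)]


edges = sectional_bin_edges()
# (lower, upper) diameter of each bin, from the smallest to the largest.
bins = list(zip(edges[:-1], edges[1:]))
# Bins whose upper edge does not exceed 2.5 µm.
pm25_bins = [b for b in bins if b[1] <= 2.5]
```

The smallest edge evaluates to 0.0390625 µm, which rounds to the 0.039 µm (about 40 nm) given in the text.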
We used eight vertical layers in this study, from sigma level 0.999 (∼20 m) to ∼0.5 (∼5500 m), equivalent to the 500 hPa pressure level. This upper limit was selected based on a suggestion that, in the modeling of anthropogenic pollution, extending the vertical dimension beyond 500 hPa would not substantially change the modeled aerosol concentrations at ground level (Menut et al., 2013). However, it is recognized that a domain top at 500 hPa may not include the free-tropospheric LRT of the pollution, which brings uncertainty to the total column AOD results. Note that in the CHIMERE version used in this study, the photolysis rates are calculated under clear-sky conditions as a function of height using the TUV model (Madronich et al., 1998), and photolysis rates are estimated only up to 9000 m. However, with the present formulation for cloud-radiation photolysis, and assuming that the model domain is below the cloud, cloud albedo was not taken into account. Monthly mean boundary conditions of gases and aerosol were taken from the simulation results for the period 1998–2002 by the Laboratoire de Météorologie Dynamique (LMDZ) - Interaction avec la Chimie et les Aérosols (INCA) (Schulz et al., 2006), which are available at the CHIMERE website. To assess the effects of the somewhat aged boundary conditions compared to the model year, the monthly average concentrations at the boundaries for 2007 were compared with those for 1998–2002. The ratios between the two datasets ranged from 0.98 to 1.23 for aerosol and PM precursor gases (i.e., BC, OC, NO2, CO, SO2, C2H4, CH3CHO, and NH3). This implies that the two datasets were very similar. Thus, the impacts of the aged boundary conditions on the simulation are expected to be small. Initial conditions of gases and aerosol concentrations in every grid were interpolated from the outputs of the global CTM of the LMDZ-INCA simulation. A 1-year simulation (1 January–31 December 2007) was performed by both WRF and CHIMERE with a spin-up period of 1 week prior to the main simulation period.

Aerosol optical depth calculation
A standalone post-processing tool, known as AODEM, developed by the Istituto di Scienze dell'Atmosfera e del Clima, Consiglio Nazionale delle Ricerche (ISAC-CNR) of Italy (Landi and Curci, 2011), was used to calculate optical parameters of AOD (extinction coefficients and single scattering albedo) using the 3-D aerosol species mass concentration fields output of WRF-CHIMERE for different size bins. AODEM calculates 3-D particle number concentrations from these mass concentrations and provides the extinction coefficients for each grid cell, assuming a spherical shape of particles (Landi, 2013). Three options of the aerosol mixing state are provided in AODEM: external, internal homogeneous, and internal coated spheres. Aerosol optical properties are simulated by AODEM following the Mie theory (Bohren and Huffman, 1983) for the wavelength range from 340 to 1640 nm. We selected the "aerosol internal mixing" option in the calculation because existing field measurements confirmed that aerosol is typically found in the internally mixed state (Lesins et al., 2002), largely due to coagulation and growth of aerosol particles (Jacobson, 2000). Note that AOD was calculated from the surface to the model's top layer of 500 hPa; hence it could not represent the transport through the convective processes taking place above the model top layer or the LRT in the free troposphere mentioned above.
For calculation of optical aerosol properties, AODEM provides the particle number concentrations separately for five components: BC, OC, sea salt, dust, and secondary inorganics (nitrate, sulfate, and ammonium). The scattering AOD was simulated using a "brute force" approach, i.e., by excluding BC in the simulation (Landi and Curci, 2011). BC AOD was calculated by subtracting the scattering AOD from the total AOD.
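The subtraction used to obtain BC AOD, and the share of BC in the total AOD, can be written as a short Python sketch. The function names and the example AOD values are hypothetical and serve only to illustrate the arithmetic.

```python
def bc_aod(total_aod, aod_without_bc):
    """BC AOD as the difference between the runs with and without BC."""
    return total_aod - aod_without_bc


def bc_share_percent(total_aod, aod_without_bc):
    """Contribution of BC AOD to the total AOD, in percent."""
    return 100.0 * bc_aod(total_aod, aod_without_bc) / total_aod


# Hypothetical grid-cell values: total AOD 0.40, AOD without BC 0.36.
example_share = bc_share_percent(0.40, 0.36)
```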

Model evaluation
The evaluation of WRF outputs was done using observed data from eight airport meteorological stations in five SEA countries that captured the major subclimate zones (upper, near-Equator, and lower latitude) in the domain. Hourly observations from all these airport stations in 2007 were obtained from http://weather.uwyo.edu/surface/meteorogram/. The statistical evaluation of WRF outputs was done using the criteria provided by Emery et al. (2001), which include the mean bias error (MBE), mean absolute gross error (MAGE), and root mean square error (RMSE).
Only limited air pollution data were available in SEA for the model performance evaluation. This study collected the observed concentrations of aerosol (BC, OC, PM2.5, and PM10) and related gases from various sources. For example, daily (24 h) concentrations of PM10, PM2.5, BC, and OC in four SEA cities (i.e., Manila, Hanoi, Bandung, and Bangkok) in 2007 were taken from the measurement data generated by the Improving Air Quality in Asian Developing Countries (AIRPET) project (Kim Oanh et al., 2006, 2014). Hourly BC and OC concentrations were taken from the measurement results of the Asian Pacific Network (APN) project at the AIT, located in Pathumthani province of the Bangkok Metropolitan Region, Thailand (Kondo et al., 2009). Hourly PM10 concentrations in Bangkok (Thailand), Kuala Lumpur (Malaysia), and Surabaya (Indonesia) in 2007 were collected from the respective national monitoring networks. The statistical evaluation of simulated aerosol levels was done using mean fractional bias (MFB) and mean fractional error (MFE) (Boylan and Russell, 2006). Definitions of the statistical measures used in the model performance evaluation are given in Table S1 in the Supplement.
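The five statistics named above can be computed from paired model and observation series as in the Python sketch below. Table S1 is not reproduced here, so the formulas are the commonly used definitions and are assumed, not confirmed, to match those in the Supplement. MFB and MFE are returned as fractions, not percentages, and the function name is hypothetical.

```python
import math


def evaluation_stats(model, obs):
    """Return MBE, MAGE, RMSE, MFB, and MFE for paired model/observed values."""
    n = len(obs)
    diffs = [m - o for m, o in zip(model, obs)]
    return {
        # Mean bias error: average of (model - observation).
        "MBE": sum(diffs) / n,
        # Mean absolute gross error: average absolute difference.
        "MAGE": sum(abs(d) for d in diffs) / n,
        # Root mean square error.
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        # Mean fractional bias: (2/N) * sum((M - O) / (M + O)).
        "MFB": 2.0 / n * sum((m - o) / (m + o) for m, o in zip(model, obs)),
        # Mean fractional error: (2/N) * sum(|M - O| / (M + O)).
        "MFE": 2.0 / n * sum(abs(m - o) / (m + o) for m, o in zip(model, obs)),
    }


# Hypothetical two-point example.
stats = evaluation_stats([2.0, 4.0], [1.0, 5.0])
```

MFB and MFE are bounded (between −2 and +2, and between 0 and 2, respectively), which makes them less sensitive to a few very low observed concentrations than a bias normalized by the observation alone.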
The monthly AERONET data for 2007 were downloaded from the National Aeronautics and Space Administration (NASA) website (http://aeronet.gsfc.nasa.gov/) for the evaluation of the modeled AOD. The AERONET data were level 2 quality controlled and recorded at 10 AERONET stations (using a sun photometer) listed in Table S2 in the Supplement. This AERONET dataset has already been pre- and post-field calibrated, with cloud screening and quality assurance. The selected 10 AERONET stations had more complete datasets in 2007 and they represent all subclimate zones in the domain. The sun photometer measures AERONET AOD at six different wavelengths (1020, 870, 675, 500, 440, and 380 nm). To compare with the modeled AOD at 550 nm, the AERONET AOD at 500 nm was therefore converted to that at 550 nm using a logarithmic interpolation (Chung et al., 2012).
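One common way to carry out such a wavelength conversion is the Ångström power law, in which AOD varies as wavelength to the power −α and is therefore linear in log-log space. The Python sketch below uses this approach and derives α from the 440 and 675 nm channels. Both the choice of channels and the exact formula are assumptions, since the details of the interpolation in Chung et al. (2012) are not given in the text. The function names and AOD values are hypothetical.

```python
import math


def angstrom_exponent(tau_1, lam_1_nm, tau_2, lam_2_nm):
    """Ångström exponent from AOD at two wavelengths."""
    return -math.log(tau_1 / tau_2) / math.log(lam_1_nm / lam_2_nm)


def aod_at_wavelength(tau_ref, lam_ref_nm, lam_target_nm, alpha):
    """Scale AOD from a reference wavelength to a target wavelength."""
    return tau_ref * (lam_target_nm / lam_ref_nm) ** (-alpha)


# Hypothetical observations: AOD 0.50 at 440 nm, 0.44 at 500 nm,
# and 0.50 * 440 / 675 at 675 nm (which corresponds to alpha = 1).
alpha = angstrom_exponent(0.50, 440.0, 0.50 * 440.0 / 675.0, 675.0)
tau_550 = aod_at_wavelength(0.44, 500.0, 550.0, alpha)
```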
For a qualitative evaluation of the spatial distributions, we checked the consistency between the modeled AOD spatial distribution over the SEA domain and the monthly MODIS AOD (level 3 data measured at 550 nm wavelength, downloaded from https://giovanni.sci.gsfc.nasa.gov/giovanni/#service=TmAvMp&starttime=&endtime=&variableFacets=dataFieldMeasurement%3AAerosol%20Optical%20Depth%3B).

Base year emissions
The obtained total national emission estimates of Indonesia, Thailand, and Cambodia for 2007 were compared with the existing regional EI databases of EDGAR for 2007 and CGRER for 2006. Table 2 shows a reasonable agreement in the ranges of the estimates between the emission databases for the three countries: Cambodia, Indonesia, and Thailand. Detailed EI results for Indonesia were presented in Permadi et al. (2017b). There are certain discrepancies between the databases that may be explained by several factors, including the uncertainty in the activity data levels and emission factors used as well as the different coverage of the emission sources by different EI works. Specifically, for the emission sources of N2O, our EI for the three countries did not cover the direct emissions from cultivated soil (fertilized land) and the indirect N2O emissions from agriculture-related activities (microbial nitrification and denitrification), hence resulting in lower N2O emission estimates. Similar reasoning may be used to explain our lower estimates for CH4 as compared to EDGAR for all three countries.
For Indonesia, our emission estimates were between those of CGRER and EDGAR for a number of species. The estimates for PM10, PM2.5, and BC agreed well between these databases, while OC of CGRER appeared to be higher. However, the SO2 estimates differed considerably between the databases and our value was lower than the others (mainly for on-road transportation and industry), which may be attributed to a more bottom-up approach used in our EI that relied on the actual sulfur content of fuels used in the country and the implementation of air pollution control devices (Permadi et al., 2017b). For example, our SO2 estimate for the power plants of Indonesia was 300 Gg yr⁻¹, which was more comparable with the CGRER estimate (409 Gg yr⁻¹) but much lower than the EDGAR estimate (1000 Gg yr⁻¹). The most striking difference was for the CO2 emissions, which showed a much higher value by EDGAR; this could be clearly explained by the inclusion of two major sources in the EDGAR dataset: (i) the forest fire post-burn decay (698 000 Gg yr⁻¹) and (ii) decay of drained peatland (504 000 Gg yr⁻¹). If these two emission sources are excluded from the EDGAR results, the CO2 estimates of all three databases are similar for Indonesia. The EI results for Thailand and Cambodia also showed reasonable agreement between the available datasets, except for CO2, which was estimated higher by EDGAR for Cambodia. The BC and OC emissions for these two countries were mostly comparable, i.e., differing by a factor of less than 2.0 among the datasets.
The emission shares by source category for the three countries are presented in Fig. S2 in the Supplement. The emissions of aerosol species (PM10, PM2.5, BC, and OC) were mainly from residential and commercial combustion in Indonesia (43–80 %) and Cambodia (55–78 %), while for Thailand the biomass OB (forest fire and crop residue) emissions were dominant, i.e., 31–74 %. For SO2, the emissions in Indonesia were mainly contributed by the transport sector (36 %) and thermal power plants (33 %), but industry was the main contributor in both Thailand (66 %) and Cambodia (33 %). For NOx, the total emissions in Indonesia were dominated by the fugitive emissions from oil and gas operation (44 %), in Thailand by power plants (34 %), and in Cambodia by forest fires (60 %). The total emissions of NH3, an important precursor for PM2.5, in all three countries were mainly from manure management and fertilizer application, i.e., 63 % for Indonesia, 75 % for Thailand, and 78 % for Cambodia.
The emissions from other SEA countries and from the non-SEA part of the domain (southern part of China) used in our modeling study are also included in Table 2. The emissions from southern China had high shares in the total emissions from the modeling domain. Indonesia and Thailand were collectively the largest emitters of all pollutants, accounting for 25–66 % of the 2007 SEA emissions and 17–44 % of the modeling domain emissions. Thus, emission reduction measures implemented for these two countries are expected to contribute remarkably to air quality improvement in the region, which will be analyzed in the companion paper (Permadi et al., 2017a). The spatial distributions of the annual average emissions of BC and CO at 0.25° × 0.25° (∼30 km × 30 km) resolution are presented in Fig. 1, showing higher emission intensity over large urban areas in the domain.

Model statistical performance evaluation
The WRF hourly outputs, including surface T, RH, and wind speed (WS) for 2007, were compared with the observed data at eight international meteorological stations in five SEA countries (Table 3). The comparison was done for two seasons: 3 months, 1 January–31 March, to represent the dry season in continental SEA (but the wet season in Indonesia) and 3 months, 1 August–31 October, to represent the wet season in continental SEA (but the dry season in Indonesia). The time series of daily average modeled vs. observed meteorological parameters, shown in Fig. S3a and b in the Supplement, indicated that the model reasonably reproduced all parameters for the considered stations. In general, the model performance for T and WS simulations at all the stations was better than for RH during both periods.
The statistical performance evaluation for the hourly simulated values against the MBE, MAGE, and RMSE criteria is given in Table 3. MBE for the January–March period ranged from −1.9 to +0.7 °C for T, from −0.3 to +2.7 m s⁻¹ for WS, and from −7.1 to +7.7 % for RH. The corresponding ranges obtained for the August–October period were −0.1 to +2.3 °C, −0.6 to +2.1 m s⁻¹, and −5.4 to +2.6 %. The other statistical measures, MAGE and RMSE, varied between the stations, and the deviations from the suggested criteria were generally small. This suggested a relatively good model performance of WRF for both dry and rainy seasons. Overall, for the stations located in the northern latitudes (above the Equator line), the model performed better in the wet season (August–October), while for those located near and lower than the Equator line the model performance was equally good for both dry and wet seasons. The discrepancy between model results and observations was perhaps partly due to the fact that the domain covers some regions, such as the Indonesian maritime continent, that are principally characterized by active convection with a frequent presence of deep convection. These local processes, e.g., deep convection, are difficult to simulate using the mesoscale meteorological model of WRF with the rather coarse resolution (0.25°, ∼30 km) used in this SEA modeling study. Therefore, finer resolutions are required to capture the dynamical processes occurring at smaller scales. Different physics options may be required for subregion domains to capture the processes, and this should be done in future studies. In addition, a certain discrepancy is always expected because the model provides a grid average value, i.e., one value per grid, while the observation is point based at individual stations.
Table 3 note: N is the number of data points. Descriptions of the statistical measures are presented in Table S1 in the Supplement.

Synoptic-scale model evaluation
Spatial distribution of surface pressure over the WRF domain is presented together with the ERA-Interim dataset in Fig. S4 for 3 selected days (1 January 2013, 8 October 2007, and 7 November 2007; 00:00 UTC). Both the modeled and ERA data showed similar spatial distribution patterns of pressure, but WRF appeared to produce slightly lower surface pressure over central Papua, Indonesia, for all three cases presented. Both datasets showed lower pressure zones over the high mountain areas of the Himalayas, eastern parts of China, and central Papua, Indonesia, which indicated the effects of the topography.
The simulated wind fields at 850 hPa (∼ 1500 m) are compared with the ERA-Interim upper wind fields in Fig. S5, which also showed consistency between the two datasets, more so in the center of the domain, for both wind speeds and wind directions. A large discrepancy was seen in the northwestern corner of the modeling domain and this may be attributed to the boundary conditions (taken from NCEP FNL in this study). The modeled monthly precipitation for 2 selected months (August and October 2007) was compared with the TRMM-3B43 dataset in Fig. 2, which showed good agreement in the distribution patterns, but the model somewhat underestimated the domain maximum monthly precipitation that occurred, e.g., over Myanmar in August 2007 and over the central part of Vietnam in October.
The domain maximum hourly values of the simulated PBL height in different months of 2007 (Fig. S6) ranged from 1800 to 3900 m. The maximum value of 3900 m occurred in March, which was lower than the model top level of 500 hPa (∼ 5500 m) mentioned above, while the lowest PBL was in August.

CHIMERE model results and evaluation
Aerosol simulation always presents a big challenge due to the complex multiphase chemistry and transport processes. The lack of ground monitoring data for aerosol in the SEA region is an obstacle to a comprehensive model performance evaluation. For the model performance evaluation, the CHIMERE results for PM10, PM2.5, BC, and the ratios of PM2.5/PM10 and BC/PM are compared with the available observed data in the domain for 2007.

PM10
The daily (24 h) modeled PM10 concentrations were calculated from the hourly data and compared with the data gathered from the governmental monitoring networks available in three big cities of SEA (i.e., one station in Kuala Lumpur, two stations in Bangkok, and one station in Surabaya). Note that the same two periods as for the WRF evaluation above were used to represent dry and rainy seasons for the parts of the domain north and south of the Equator. Overall, model results ranged from near 0 to 85 µg m⁻³ while the observations ranged from 5 to 90 µg m⁻³ in the three cities. The period average of modeled PM10 in the three cities ranged from 21.7 to 29.2 µg m⁻³ while the corresponding observations ranged from 25.9 to 45.2 µg m⁻³ (Fig. 3).
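The construction of 24 h values from hourly data described above can be sketched as follows. This is only an illustration: the helper name `daily_means` and the completeness threshold of 18 valid hours per day are assumptions introduced here, not choices reported in this study, and the sample series is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_means(hourly, min_hours=18):
    """Average hourly values into daily (24 h) means.

    hourly: iterable of (datetime, value) pairs; value may be None if missing.
    min_hours: hypothetical completeness threshold; days with fewer valid
               hourly values are dropped rather than averaged.
    """
    buckets = defaultdict(list)
    for timestamp, value in hourly:
        if value is not None:
            buckets[timestamp.date()].append(value)
    return {
        day: sum(values) / len(values)
        for day, values in sorted(buckets.items())
        if len(values) >= min_hours
    }

# Synthetic hourly PM10 series (ug m-3): one complete day, one incomplete day.
start = datetime(2007, 1, 1)
series = [(start + timedelta(hours=h), float(h)) for h in range(24)]
series += [(start + timedelta(days=1, hours=h), 50.0) for h in range(10)]

result = daily_means(series)
for day, mean in result.items():
    print(day, round(mean, 1))
```

Only the first day is kept in this example because the second day has just 10 valid hours, which is below the assumed threshold.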
Scatter plots of daily average observed and modeled values (Fig. 3) showed that the model reasonably captured the range of 24 h PM10 in the cities, but the relationship was not linear. The model underestimated the low observed values at the Kuala Lumpur station (one station); i.e., the observed levels were 30–60 µg m⁻³, while the modeled levels fluctuated from near 0 to about 60 µg m⁻³. A better agreement in the range of 24 h PM10 was shown for Surabaya, i.e., both ranged from 5 to 85 µg m⁻³, but the linear correlation was still quite low. For Bangkok, the modeled 24 h PM10 ranged from 10 to 60 µg m⁻³ while the upper limit of the observed values was 90 µg m⁻³. Although the ranges of the modeled 24 h PM10 were comparable with the observed ranges, the correlations were not clear for any of the three cities.
The discrepancy in the day-to-day variations between the modeled and observed 24 h PM10 values could be attributed to the lower accuracy of the temporal variations of the emission input data and the coarse resolution of the model, which, for example, may not be able to represent the weather variables in a convection-dominated climate. It is always challenging to compare regional-scale modeling results obtained at a coarse resolution (i.e., 30 km × 30 km) with point-based observations, especially in complex mixed urban areas. A lack of systematic monitoring data for PM10 at rural sites of the domain during the modeling periods prevented us from making a more comprehensive model performance evaluation. The statistical evaluation showed that in all three cities, the MFB and MFE values for 24 h PM10 (in total 179 data points for each city) were within the suggested criteria (Table 4). The MFB values in Bangkok, Kuala Lumpur, and Surabaya were −53, −56, and −9 %, respectively, i.e., meeting the criteria of ≤ ±60 %. The MFE values in Bangkok, Kuala Lumpur, and Surabaya were 55, 56, and 18 %, respectively, which were also well within the criteria of ≤ +75 %. The simulated monthly averages of PM10 in Kuala Lumpur and Bangkok were consistently lower than the observed values in all months (Fig. 3), which should be expected in principle due to the grid averaging of the model results. For Surabaya, however, the model-simulated monthly PM10 values were higher than the observed during the period of January–March 2007 but lower than the observed for the period of August–October.
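The MFB and MFE screening described above can be sketched as below. The formulas are the commonly used definitions of mean fractional bias and mean fractional error and are assumed here, since they are not written out in this section; the function names and the two-point sample are hypothetical. The thresholds of ±60 % and 75 % and the reported city values are those quoted in the text.

```python
def mfb(model, obs):
    """Mean fractional bias in percent (assumed conventional definition)."""
    n = len(obs)
    return 100.0 * (2.0 / n) * sum((m - o) / (m + o) for m, o in zip(model, obs))

def mfe(model, obs):
    """Mean fractional error in percent (assumed conventional definition)."""
    n = len(obs)
    return 100.0 * (2.0 / n) * sum(abs(m - o) / (m + o) for m, o in zip(model, obs))

def meets_pm_criteria(mfb_pct, mfe_pct, mfb_limit=60.0, mfe_limit=75.0):
    """True if |MFB| <= 60 % and MFE <= 75 %, the criteria quoted in the text."""
    return abs(mfb_pct) <= mfb_limit and mfe_pct <= mfe_limit

# Hypothetical 24 h PM10 pairs (ug m-3); NOT data from this study.
obs_pm10 = [40.0, 30.0]
mod_pm10 = [20.0, 30.0]
print(round(mfb(mod_pm10, obs_pm10), 1), round(mfe(mod_pm10, obs_pm10), 1))

# (MFB, MFE) values reported in the text for the three cities.
reported = {"Bangkok": (-53, 55), "Kuala Lumpur": (-56, 56), "Surabaya": (-9, 18)}
for city, (bias, error) in reported.items():
    print(city, meets_pm_criteria(bias, error))
```

Because both measures are normalized by the sum of the modeled and observed values, they are bounded (±200 % for MFB, 200 % for MFE) and are less dominated by a few very high or very low days than an absolute bias.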
Overall, the discrepancy between the modeled and observed PM10 and other parameters may be caused by several factors, including the input fields of meteorology and emission data for the simulation. The WRF model evaluation presented above showed an acceptable performance (see Sect. 3.2) but still with discrepancies with the observed data. More uncertainty, however, was expected from the EI input data. In addition, uncertainty may arise from the monitoring data, especially where there is a large number of missing data points, such as in Surabaya, Indonesia. Overall, the simulation of urban areas would require more refined emission input data to capture the local emission sources, such as roads or industries, and this should be addressed in future studies.

PM2.5
Only some fragmented PM2.5 measurement data were available in the domain in 2007 for the model evaluation (Fig. 4). This study used the 24 h PM2.5 data monitored in the SEA cities of Bandung, Bangkok, Hanoi, and Manila under the AIRPET project (Kim Oanh et al., 2006, 2014). The observation data were only available for some specific periods in 2007 at different sites and hence the modeled results were extracted for the corresponding periods for comparison. The observation sites were mixed sites influenced by typical emission sources in the respective cities. The AIT site, located about 650 m away from a heavily traveled road, represented a suburban site with the influences of emissions from traffic and OB of rice straw (Kim Oanh et al., 2009). Thuong Dinh (TD) in Hanoi was a mixed urban site influenced by traffic and residential combustion among other sources (Hai and Kim Oanh, 2013). Both Tegalega (TG), located in Bandung, Indonesia, and Manila Observatory (MO) in Manila, Philippines, were mixed urban sites with strong influence of traffic and other typical urban sources. The data therefore represent different periods of the year and sites with different urban characteristics and are used only for model performance evaluation, not to compare the levels between the cities. Overall, the available observed 24 h PM2.5 data in the four AIRPET cities ranged from 4 to 120 µg m⁻³ while the modeled values for the same data periods ranged from 5 to 64 µg m⁻³. The average levels of the observed PM2.5 over all the data periods ranged from 35 to 43 µg m⁻³ as compared to the modeled averages of 9.7 to 21 µg m⁻³.
Scatter plots of observed and modeled 24 h PM2.5 at the four AIRPET stations (Fig. 4) clearly showed that the model underestimated 24 h PM2.5 at all stations. At the mixed polluted urban site in Bandung (TG), modeled 24 h PM2.5 values were within the range of 11–33 µg m⁻³ while the observed values were 27–69 µg m⁻³. At the TD urban site in Hanoi (close to a busy road), the simulated 24 h PM2.5 values were 5–64 µg m⁻³ as compared to the observed range of 20–120 µg m⁻³. At the mixed urban site of MO in Manila, the simulated 24 h PM2.5 values were 6–37 µg m⁻³ as compared to the observed range of 4–55 µg m⁻³. As discussed above, the four selected AIRPET sites were located quite close to heavily traveled roads (although they were not directly on the roadside) and hence the local traffic emissions could directly affect the monitored pollution levels. This may be an important reason for the discrepancy between the monitored levels and the simulated grid average values. In addition, the observed data points were quite limited for 2007 (≤ 30 at each site) and were thus not sufficient for the statistical model performance evaluation. The PM2.5 monitoring efforts should be enhanced to characterize the pollution in SEA and also to provide sufficient data points for model evaluation.

Black carbon
For model evaluation purposes, we used available measurements from previous projects in SEA. The 24 h BC measured by the optical method was available at several SEA sites under the AIRPET project (Kim Oanh et al., 2014). The hourly EC (elemental carbon, measured by a Sunset analyzer) measurements, available from the APN project (Kondo et al., 2009), were used to calculate 24 h BC levels. EC was measured using the thermal optical method while BC was measured using the light absorption method (continuous soot monitoring system or COSMOS). The model performance evaluation was done using the 24 h BC data of both the APN and AIRPET projects.
The APN hourly EC dataset for the AIT site was available for both dry and wet seasons, from March to December 2007. The hourly EC and hourly BC measured simultaneously by the APN project at AIT were found to have a strong linear correlation (Kim Oanh et al., 2009). Therefore, we used the observed Sunset EC to compare with the modeled output of BC. This is because for the PM mass closure, EC seems to be better while BC is suitable for radiative transfer budget analysis (Gelencsér, 2004). Figure 5 presents the time series of the modeled and observed 24 h BC for the AIT site. The modeled 24 h BC ranged from 1.0 to 10 µg m⁻³, which is comparable with the observed range of 0.8 to 10 µg m⁻³. However, the correlation between the modeled and observed BC shown in the scatter plot was fairly low. The discrepancy between the modeled and observed BC seen in the time series may principally be due to the gridded average of the model output as compared to the point-based measurement. Higher BC levels measured at the AIT site were contributed by multiple local sources, such as nearby highway traffic activity and biomass OB (of rice straw) that occurred more intensively during the dry season (December). However, these sources, especially small-scale rice straw field burning activity, may not be well represented spatially by the EI input data prepared at a coarse resolution (30 km × 30 km). Three statistical measures, MBE, MFB, and MFE, were considered for the model performance evaluation of the BC simulation at the AIT site (Table 4). The MFB and MFE values were −24 and 49 %, respectively, which meet the suggested criteria (for PM). The MBE value was −0.12 µg m⁻³ for the AIT site, which showed that the model somewhat underestimated the observed BC values, but there are no MBE criteria available for PM for comparison.
The 24 h BC (optically) measured on the 24 h PM2.5 filter samples collected at the same locations as the PM2.5 measurements in SEA under the AIRPET project (Kim Oanh et al., 2006, 2014) was compared with the 24 h modeled BC extracted for the sites and dates of 2007. Figure 6 shows that the modeled 24 h BC values were lower than the observed at all the sites. The ranges of observed values and modeled values were in somewhat better agreement for the AIT site and MO site than for the other two sites. At AIT, the observed BC values of 1.3–3.4 µg m⁻³ (January, February, and May) were higher than but quite comparable to the modeled range of 0.5–1.8 µg m⁻³. At MO, the observed 24 h BC of 7–13 µg m⁻³ (January and February) was quite close to the modeled 24 h BC of 4.2–13 µg m⁻³. More discrepancies were found for the Bandung site, with observed 24 h BC values ranging from 4.2 to 9.8 µg m⁻³ (July 2007) as compared to the modeled values of 1.3–3.2 µg m⁻³. Similarly, the observed BC values at the mixed site of TD, Hanoi, ranged from 12 to 23 µg m⁻³ (January 2007), much higher than the modeled values of 1–7 µg m⁻³. The effects of local sources, especially traffic emissions, at these sites are likely a main cause of the discrepancies between the grid average modeled BC and the observed values. The limited measurement data available prevented a more comprehensive model performance evaluation. Note that due to the limited measurement data points, a statistical performance evaluation was not conducted for the BC simulation at these sites.

Ratios between fine and coarse PM and between BC and PM
PM2.5 mass is principally contributed by both local combustion sources and secondary particles formed by chemical reactions in the atmosphere. The gaseous precursors of NOx, SOx, and VOCs for the PM2.5 formation may be of both local and LRT origins. The coarse fraction (PM10−2.5) would mainly consist of primary particles of geological origin (Chow et al., 1998), and these are mainly contributed by local sources of soil, road dust, and construction activities (Hai and Kim Oanh, 2013). Due to their formation process as well as their ability to participate in regional transport, the fine particles (PM2.5) are more uniformly distributed in an urban area than the coarser particles. The PM2.5/PM10 ratios could provide some information on the dominance of local sources of PM2.5. We compare the PM2.5/PM10 ratios based on the modeled 24 h PM2.5 and 24 h PM10 (PM10 = PM2.5 + PM10−2.5) with those computed from the observed PM data available at the four AIRPET monitoring sites discussed above. Overall, the modeled PM2.5/PM10 ratios ranged from 0.47 to 0.59 while the observed values were higher, 0.6–0.83. More pronounced differences were found for TD, i.e., 0.74 observed vs. 0.47 modeled, and for TG of Bandung, 0.83 observed vs. 0.55 modeled.
Better agreements were obtained for MO, 0.61 observed vs. 0.47 modeled, and the AIT site, 0.60 observed vs. 0.59 modeled. The urban mixed sites of TD in Hanoi and TG in Bandung were located in traffic areas and thus higher contributions of the primary PM2.5 emitted from traffic to the total measured PM10 may be seen compared to the MO and AIT sites. However, to evaluate the variations in the PM2.5/PM10 ratios, contributions of various sources of the coarse particles, such as road dust and construction dust, should be further analyzed. It is noted that the ratios used to compare with the model-simulated values were all derived from the observations made in large cities in SEA. Lack of observation data in rural areas and remote sites presents an obstacle for more in-depth analysis of the model performance. The remote sites, with less influence of the local sources, could be valuable for the model performance evaluation, both for the PM mass concentrations and their ratios.
BC is emitted directly from combustion sources, with higher fractions in PM emitted from diesel exhaust (Kim Oanh et al., 2010) and lower fractions from biomass OB (Kim Oanh et al., 2011). Hence the ratio of BC/PM2.5, for example, can indicate the contribution of the primary particles from these combustion activities. BC/PM2.5 and BC/PM10 ratios were calculated using the observed 24 h data at four AIRPET sites. Modeled BC/PM2.5 ratios ranged from 0.05 to 0.33 as compared to the observed ratios of 0.05–0.28. For BC/PM10, the modeled values ranged from 0.03 to 0.16 while the observed values ranged from 0.034 to 0.17. Observed BC/PM2.5 ratios were higher than the modeled values at the TG of Bandung (0.16 vs. 0.1) and AIT (0.055 vs. 0.05) sites. At TD and MO, the observed ratios (0.22 and 0.23) were lower than the modeled (0.28 and 0.33). As for BC/PM10, the observed ratios at three AIRPET sites of TG, TD, and AIT (0.13, 0.17, and 0.034) were higher than the modeled values (0.06, 0.13, and 0.03), while for MO the opposite was shown with a lower observed (0.14) as compared to the modeled (0.16) value. The simulated BC/PM ratio was the highest in TD, 0.22 % for PM2.5 and 0.17 % for PM10, during the dry period of January–February 2007, which confirmed the strong influence of traffic emission at this site.
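The site-by-site ratio comparison above can be tabulated with a few lines of arithmetic. The sketch below uses only the observed and modeled PM2.5/PM10 and BC/PM2.5 ratios quoted in the text; the helper name `bias_table` is hypothetical and introduced here purely for illustration.

```python
# (observed, modeled) ratios quoted in the text for the four AIRPET sites.
pm25_pm10 = {"TD": (0.74, 0.47), "TG": (0.83, 0.55),
             "MO": (0.61, 0.47), "AIT": (0.60, 0.59)}
bc_pm25 = {"TG": (0.16, 0.10), "AIT": (0.055, 0.05),
           "TD": (0.22, 0.28), "MO": (0.23, 0.33)}

def bias_table(ratios):
    """Return (site, modeled - observed) pairs, largest absolute gap first."""
    gaps = [(site, round(mod - obs, 3)) for site, (obs, mod) in ratios.items()]
    return sorted(gaps, key=lambda item: abs(item[1]), reverse=True)

for name, table in (("PM2.5/PM10", pm25_pm10), ("BC/PM2.5", bc_pm25)):
    print(name)
    for site, gap in bias_table(table):
        direction = "model lower" if gap < 0 else "model higher"
        print(f"  {site}: {gap:+.3f} ({direction})")
```

Laid out this way, the PM2.5/PM10 gap is negative at all four sites, whereas the sign of the BC/PM2.5 gap differs between sites, which matches the description in the text.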
The lack of data for the areas outside the cities is a remaining issue. Generally, we expect that PM2.5 mass may be more uniform in an urban area; for example, measurements conducted in several mountain areas in Asia showed high PM2.5 concentrations which were mainly due to the regional transport (Hang and Kim Oanh, 2014; Co et al., 2014) or local combustion sources (e.g., residential cooking, biomass OB) such as found in China (Liu et al., 2017). However, the BC fraction of PM may vary a lot, with much lower values at remote sites, but a lack of data prevents a more in-depth analysis.
As seen in the statistical model evaluation, a negative MBE was obtained for PM10, −3 to −17, and BC, −0.12, at all sites (not enough data for statistical evaluation of PM2.5), which showed an underestimation of PM10 and BC concentrations by the model at the sites. This may be explained by the coarse resolution (30 km × 30 km) of the emission input data, which could not adequately represent the spatial distributions of local sources on a smaller scale, such as road traffic. These local sources, for example road traffic and residential cooking, affect PM measured at all sites, hence affecting the PM2.5/PM10 and BC/PM ratios. The road and soil dust emissions contribute more to PM10−2.5, thus lowering PM2.5/PM10 ratios in urban areas, but this coarse fraction of PM emission was not included in our emission input file. In addition, the LRT pollution above the model top layer (> 500 hPa) may contribute to the pollution in the domain, more to PM2.5 and BC than to the coarse PM. The free-tropospheric LRT of aerosol and high convective processes should also be considered by extending the vertical model setup in future studies.

3.4 Spatial distribution of modeled monthly PM10, PM2.5, and BC
Spatial distributions of the modeled monthly average PM10, PM2.5, and BC are presented in Fig. 7 for January, August, and November while those of the respective annual averages are presented in Fig. S7 in the Supplement. The highest monthly average concentrations of PM10 in January, August, and November 2007 simulated in the domain (one value for the whole domain) were 69, 58, and 44 µg m⁻³ while the corresponding values of PM2.5 were 40, 37, and 27 µg m⁻³, respectively. The simulated maximum monthly average BC concentration in the domain was higher in January (8.2 µg m⁻³) as compared to August (7.8 µg m⁻³) and November (5.9 µg m⁻³). The simulated highest hourly PM10 values in the considered months of January, August, and November 2007 were 325, 245, and 164 µg m⁻³, respectively, while the corresponding PM2.5 values were 188, 150, and 99 µg m⁻³. The highest values of the simulated annual average in the domain for PM10 and PM2.5 were 51 and 32 µg m⁻³, respectively. The maximum simulated annual average in the domain for BC was 6 µg m⁻³. A summary of the simulated pollutant levels in the domain is presented in Table S3 in the Supplement.
For all considered pollutants over the domain, higher concentrations were seen over East Java, Indonesia, particularly over Surabaya, which shows the effects of emissions from residential and traffic sources in the city and surrounding satellite cities as well as the crop residue OB (Permadi and Kim Oanh, 2013; Permadi et al., 2017b). High concentrations were consistently seen in several places in Indonesia including Java, West Sumatra (Padang), and West Kalimantan (Pontianak) and over Bangkok, Thailand. Large hotspots but with lower concentrations were also seen over Southern China and over Hanoi and Ho Chi Minh (Vietnam), which can be largely explained by the influence of the local sources (Fig. 7).
The monsoon circulation plays an important role in transporting PM from the emission source regions to other parts of the domain. In the dry months, higher emissions of biomass OB are expected, and higher concentrations of PM should be seen in the region near and downwind of sources. Accordingly, in the northern part of the domain, higher PM levels were found in January–April, while in the southern part of the domain higher concentrations were found during the period of April–August. In January in the Northern Hemisphere, the Northeast Monsoon transports pollutants from the source regions to the southwest, while in the Southern Hemisphere (Indonesia) the plume moved to the northeast-east. The opposite is seen in August and November (Fig. 7). In August in the Southern Hemisphere, the plumes of PM moved northwesterly and turned northeasterly after reaching the Equator line. The plumes of PM10 and PM2.5 converged in the South China Sea in January and November, when the prevalent Northeast Monsoon brought PM pollution from the southern part of mainland China to the South China Sea (Fig. S8a–d). WRF results showed no rainfall over the South China Sea during this period, which may also contribute to the high PM levels in the convergence zone (Fig. S8e–f).
In August and November, the dry months in the southern domain, the PM10 and PM2.5 plumes showing the effects of biomass OB (crop residue and forest fire) emissions in Indonesia that originated in Riau province (Sumatra) and western and southern parts of Borneo were seen clearly moving northeastward. In January, the dry season month in the northern domain, the plumes of PM10 and PM2.5 intensified by biomass OB in the central and northern parts of Thailand were shown moving southwestward. BC plumes generally originated from big cities in the domain, showing a significant influence of fossil fuel combustion emissions, specifically from traffic and other urban activities, for all months of the year. During the dry period, BC plumes from the areas that have intensive biomass OB emissions were not as clearly seen as the PM plumes and this may be because biomass OB contributed more to OC than to BC emissions. Effects of precipitation on the PM levels were also seen; e.g., higher PM levels (Fig. 7) were simulated over Indochina in January, October, and November as compared to August because the latter was a rainy month in this part of the domain, i.e., less biomass OB and more wet removal in principle. The opposite was seen in the southern part of the modeling domain, e.g., over Indonesia, where lower PM levels were simulated in October (a rainier month in this part) than in other months.

Aerosol optical depth
Both total AOD and BC AOD were considered for the model evaluation. The monthly average of the total columnar AOD (scattering and absorbing), at the wavelength of 550 nm, was produced from the AODEM simulation for 2007. The simulated monthly AOD data were compared with the monthly Terra MODIS AOD, also at 550 nm, retrieved from the NASA website. Figure 8 showed that the modeled AOD was lower than the MODIS observed; for example in January, the maximum AOD simulated for the Southern China part of the domain was about 0.36 as compared to the MODIS AOD of 0.42–0.58. In the same month, the modeled AOD values over Java, Indonesia, were 0.072–0.28 while the MODIS AOD values were 0.26–0.42. In April, the model results over Southern China were 0.25–0.75 while the observed MODIS AOD was 0.42–0.90. Near the border between Myanmar and Bangladesh (northwest corner of the domain), the modeled AOD and the observed MODIS AOD were similar, 0.74–0.75. However, the modeled AOD values over Java in April were higher, i.e., 0.02–1.0, than the observed MODIS AOD of 0.26–0.42. The simulated hourly maximum and monthly average PM10 and PM2.5 concentrations, and hence AOD, over Java were the highest throughout the year. In particular in April, there was a hotspot of AOD simulated over this location that may be due to the meteorological conditions. For example, the restricted dispersion conditions in April could be seen from the smaller dispersion plumes in this month as compared to other months in Fig. 8.
In October, a hotspot with the maximum AOD of 0.8 was observed by MODIS over Riau, Sumatra, and Singapore that was well above the model result for the grid of 0.4. The model was also not able to capture AOD hotspots over mainland Southern China in this month. The results for August and November both showed some significant underestimation of AOD as compared to the MODIS-observed values. There are several reasons for these discrepancies, including the temporal and spatial inconsistency in the observed and modeled values used for comparison. For example, the Terra MODIS satellite passes over a region once daily at a particular time (i.e., 13:30), thus giving only a snapshot of the value, while the model provided the hourly average for 13:00–14:00. Thus there is certainly inconsistency in the monthly averages derived from these two datasets. The discrepancy may also come from the fact that the simulated AOD covered only the column up to 500 hPa and could not include aerosol in the upper layers, as mentioned above. The different spatial resolutions of the modeled AOD (30 km × 30 km) and MODIS AOD (10 km × 10 km) can be another reason. In addition, shipping emissions and the natural sources of aerosol, such as wind-blown dust, were not included in our emission input data so the model would produce lower AOD (as well as PM10) values. Consistent with the PM results, the effects of precipitation on AOD were captured, i.e., higher in the dry months and lower in the wet months in the respective parts of the domain. Overall, this qualitative analysis of the modeled vs. MODIS AOD provided only some insight into the regional distributions. Further efforts to conduct more comprehensive model evaluation are still required.
Simulated monthly AOD values were also compared with the observed data retrieved from 10 AERONET stations located in the domain, i.e., in Vietnam, Singapore, Hong Kong, Taipei, Thailand, and Indonesia, which also showed lower simulated AOD values than the AERONET-observed values (Fig. 9). The model appeared to better capture seasonal variability at the Bac Lieu (Vietnam), Silpakorn University, and Songkhla (Thailand) stations, while at other stations the model underestimated the AOD. The model seemed to be able to simulate well the monthly average AOD at Hok Sui station (Hong Kong) but only for the months of October, November, and December 2007. The strong seasonal variation of aerosol in SEA, largely caused by the biomass OB and meteorological conditions, creates a huge challenge for models to reproduce. At Puspiptek Serpong (Indonesia), where emissions of urban activities from the capital city of Jakarta would dominate, the high AOD in October was reasonably captured by the model. The seasonal variation in the emission input file for anthropogenic sources would need to be further refined to improve the situation. Better proxies should be used for transport, industry, residential combustion, and thermal power generation, which would reflect the actual variation in the monthly activity data of each sector.
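One of the reasons given above for the lower modeled AOD is the difference between the 30 km model grid and the 10 km MODIS product. The sketch below illustrates the averaging effect only: it assumes that exactly 3 × 3 fine cells align with one coarse cell, which real regridding would not guarantee, and it uses a hypothetical helper `block_average` with synthetic AOD values. It is not the procedure used in this study.

```python
def block_average(grid, factor=3):
    """Average a fine grid onto a coarser grid, factor x factor cells per block.

    grid: list of rows; None marks a missing retrieval (e.g., cloud cover).
    Blocks with no valid cells return None. Trailing rows or columns that do
    not fill a complete block are ignored in this simplified sketch.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c0 in range(0, cols - cols % factor, factor):
            cells = [grid[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)
                     if grid[r][c] is not None]
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse

# Synthetic 10 km AOD field: one hotspot cell (0.9) in a 0.3 background,
# next to a block with no valid retrievals.
fine = [
    [0.3, 0.3, 0.3, None, None, None],
    [0.3, 0.9, 0.3, None, None, None],
    [0.3, 0.3, 0.3, None, None, None],
]
coarse = block_average(fine)
print(coarse)
```

In this synthetic case the single 0.9 hotspot is diluted to about 0.37 on the coarse grid, which shows how a localized maximum seen at 10 km can be substantially smoothed at 30 km.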
The BC AOD (absorbing) was calculated as the difference between the total AOD (scattering + absorbing) and the scattering AOD, following the method presented in Landi and Curci (2011). The spatial distribution of monthly average BC AOD is presented in Fig. S9 in the Supplement. In January, the dispersion plumes of high BC AOD spread over Southern China (maximum AOD of 0.027) and eastern parts of Indonesia (maximum 0.018), which shared 7.5–10 % of the total AOD of 0.36 and 0.18, respectively. In April, the highest value of the modeled BC AOD was seen over Surabaya (East Java province, Indonesia) with a range of 0.051–0.078, followed by relatively high values over Hong Kong and Shenzhen of 0.06–0.069. The contributions of the BC AOD to the total AOD in Surabaya, Hong Kong, and Bangladesh were 9 % (of 0.89), 11 % (of 0.6), and 12 % (of 0.54), respectively.
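The subtraction and the percentage shares described above can be checked with simple arithmetic. In the sketch below the BC AOD and total AOD values are the January maxima quoted in the text; the function names are hypothetical, and the scattering AOD of 0.333 is back-calculated here for illustration only and is not a value reported in this study.

```python
def absorbing_aod(total_aod, scattering_aod):
    """BC (absorbing) AOD as total AOD minus scattering AOD."""
    return total_aod - scattering_aod

def share_percent(bc_aod, total_aod):
    """Contribution of BC AOD to total AOD, in percent."""
    return 100.0 * bc_aod / total_aod

# January maxima quoted in the text: (BC AOD, total AOD).
january = {"Southern China": (0.027, 0.36), "eastern Indonesia": (0.018, 0.18)}
shares = {region: share_percent(bc, total) for region, (bc, total) in january.items()}
for region, pct in shares.items():
    print(f"{region}: {pct:.1f} % of total AOD")

# Illustrative only: a scattering AOD of 0.333 (back-calculated, not reported)
# would reproduce the Southern China BC AOD of 0.027 from a total of 0.36.
print(round(absorbing_aod(0.36, 0.333), 3))
```

The two shares come out at 7.5 % and 10 %, consistent with the 7.5–10 % range stated for January.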
In other months, the highest monthly average BC AOD was shown in different parts of the domain, ranging between 0.015 and 0.027, while the total AOD was 0.18–0.36; hence the shares of BC AOD in the total AOD were 7.5–8.6 %. Our BC AOD contributions to the total AOD were higher than the reported global average value of 3 % (Reddy et al., 2005), but in the same range as those reported for different regions with intensive emission sources. The relative contribution of BC to total AOD has been reported to depend on both wavelength, i.e., increasing with decreasing wavelength, and the dominant emission sources. For example, measurements showed typical contributions of around 12 % under the influence of natural dust (Chiapello et al., 1999) and around 5–12 % when biomass OB is dominant (Eck et al., 1999; Dubovik et al., 2002). The modeled BC AOD serves as input to estimate BC direct radiative forcing of anthropogenic emissions for the SEA domain, which will be analyzed in our companion paper (Permadi et al., 2017a).

Summary and conclusions
This study developed and evaluated the EI databases for Indonesia, Thailand, and Cambodia for 2007. The results were compiled with the existing CGRER and EDGAR emission datasets to generate the emission input data of the entire SEA domain for regional WRF-CHIMERE modeling. Our EI results for the three countries were comparable to other existing databases and the differences are explained mainly by the differences in the sources covered by different EI works. The BC emissions were mainly from residential and commercial combustion in Indonesia (71 %) and Cambodia (70 %) but were dominated by biomass OB emissions in Thailand (31 %).
The model performance for 2007 was evaluated using the hourly and daily observed data in the SEA domain. The WRF model outputs were in good agreement with the observed data at eight international airport stations in Indonesia, Thailand, Vietnam, Cambodia, and the Philippines. The WRF-CHIMERE model satisfactorily reproduced the aerosol species of PM10, PM2.5, and BC in terms of the spatial distributions and seasonal variations. The statistical evaluation was conducted for 24 h PM10 and 24 h BC, which had sufficient observed data points for the analyses. The modeled 24 h PM10 in three cities (in Thailand, Malaysia, and Indonesia) had MFB and MFE values that met the suggested criteria. Similarly, the modeled 24 h BC values met the MFB and MFE criteria for PM when compared with the observed data at a suburban site in Thailand (AIT).
The PM2.5/PM10 ratios calculated from the modeled outputs were lower than those estimated from the observed data at four AIRPET sites, which implies a need for further improvement of the PM speciation of the emission input data. The modeled BC/PM2.5 ratios were in a comparable range (0.05–0.33) to the observed values (0.05–0.28) and were lower at two sites (AIT and Bandung) but higher at the others (Hanoi and Manila). The modeled BC/PM10 ratios ranged between 0.03 and 0.16, which was comparable to the observed range (0.034–0.17). Lack of systematic observed BC data prevented a more comprehensive model performance evaluation. Nevertheless, further improvement of the EI for primary aerosol, especially the PM speciation of major sources, as well as inclusion of unpaved road and wind-blown dust emissions, is required. The vertical model setup should be extended beyond 500 hPa (∼ 5500 m) in future studies to better incorporate the free-tropospheric LRT of aerosol.
The spatial distributions of the total columnar AOD estimated from the WRF-CHIMERE output PM concentrations using AODEM were comparable with the observed AOD (MODIS and AERONET) in 2007. Exclusion of the unpaved road and wind-blown dust emissions (coarse particles) from the emission input in this study was one reason for the discrepancy between the modeled and observed total AOD, because the coarse PM concentrations were possibly underestimated. The lower values of aerosol species simulated by the model were explained by grid averaging effects: WRF-CHIMERE had a larger grid of 30 km, as compared to 10 km for MODIS AOD, while AERONET is point based. Thus, the spatial distribution of smaller local sources cannot be captured well by WRF-CHIMERE.
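The grid averaging effect described above can be illustrated with a minimal sketch, assuming a simple unweighted block mean. It does not represent the actual WRF-CHIMERE, MODIS, or AERONET processing; the function name `block_mean` and the numerical values are hypothetical.

```python
def block_mean(field, factor):
    """Average a 2-D field (list of rows) over factor x factor blocks."""
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            # Collect all fine cells that fall inside one coarse cell.
            block = [
                field[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # Hypothetical 10 km field: one local hotspot in a uniform background.
    fine = [
        [9.0, 9.0, 9.0],
        [9.0, 90.0, 9.0],
        [9.0, 9.0, 9.0],
    ]
    # Aggregating 3 x 3 cells of 10 km gives a single 30 km cell.
    coarse = block_mean(fine, 3)
    peak_fine = max(max(row) for row in fine)
    print(f"10 km peak = {peak_fine}, 30 km cell value = {coarse[0][0]}")
```

In this example the hotspot value of 90 is diluted to 18 in the 30 km cell, which shows how a coarser grid can yield lower values than a finer-resolution or point-based observation over a localized source.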
The spatial distribution patterns of the modeled aerosol species in the domain may be explained by the intensive biomass OB emissions. The plumes of PM10 and PM2.5 originated from Sumatra and Borneo in Indonesia in August-November and from central and northern Thailand during January-April, which coincided with the dry months in the respective areas and hence with more biomass OB. Spatial distributions of BC showed the influence of traffic emissions and residential combustion in big SEA cities. Based on the model results, the contribution of BC AOD to total AOD in the domain was around 7.5-12 %, which is consistent with the literature-reported values for intensive emission areas. Effects of precipitation were captured by the model, which produced lower PM and AOD levels in the months with higher simulated precipitation.
The EI data and WRF-CHIMERE performance for 2007 were satisfactory in terms of reproduction of the key aerosol species in the domain. In the companion paper (Permadi et al., 2017a), we present the WRF-CHIMERE simulation results for PM and BC for the SEA domain in the business-as-usual emission scenario (BAU2030) and in the emission reduction scenario (RED2030) to quantify potential co-benefits for air quality improvement, reduction in the number of premature deaths, and radiative forcing mitigation in Southeast Asia.
Data availability. All involved data including emission and model results are available upon request to the corresponding author.
Competing interests. The authors declare that they have no conflict of interest.

Figure 2. Comparison of modeled monthly precipitation and the TRMM-3B43 dataset in August and October 2007.

Figure 3. Comparison of modeled and observed 24 h PM10 in Kuala Lumpur, Malaysia (one station), Surabaya, Indonesia (one station), and Bangkok, Thailand (three stations). Note that the stations included in the comparison are those located within the cell.

Figure 5. Time series comparison and scatter plot of modeled vs. observed 24 h elemental carbon at the AIT site, 2007.

Figure 6. Comparison of 24 h simulated and observed BC at four AIRPET sites in the SEA domain, 2007.

Figure 7. Spatial distribution of monthly average PM10, PM2.5, and BC in the selected months, 2007.

Figure 8. Spatial distribution of monthly modeled AOD as compared to the MODIS Terra AOD for the selected months, 2007.

Table 1. Summary of activity data level from different emission sources in three countries.
and upper wind fields (at 850 hPa) were compared with the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis (ERA) Interim data extracted from http://apps.ecmwf.int/datasets/ for the same vertical levels. Simulated monthly accumulated precipitation fields were compared with the satellite-based observations provided by the Tropical Rainfall Measuring Mission (TRMM-3B43) (https://disc.sci.gsfc.nasa.gov/datasets/TRMM_3B43_V7/summary?keywords=TRMM).

Table 2. [...] in comparison with the existing regional EI datasets (Gg yr−1).
Column headings: This study (a), EDGAR (b), CGRER (c). Row labels include: [...] countries (d), part of China, domain.
a EI conducted for the base year of 2007 using the ABC EIM framework (Shrestha et al., 2013). Detailed methodology and results were presented in Permadi et al. (2017a). b EDGAR for the base year of 2007 (Olivier et al., 2001). Retrieved from http://edgar.jrc.ec.europa.eu/overview.php?v=42FT2012. The CO2 emissions, excluding forest fire post burn decay and decay of drained peatland, are given in brackets for comparison with our estimates.

Table 3. Statistical parameters for WRF model performance evaluation for two periods (bold values are those meeting the evaluation criteria).

Table 4. Statistical parameters for CHIMERE model performance (PM10 and BC) evaluation.
a Period taken was from January to March and August to October 2007 for all stations (daily average concentrations). b Urban mixed site. c Urban mixed site. d Background concentration. e Urban mixed site. f Period taken was from March to December 2007 (daily average concentrations).