Impact of spatial proxies on the representation of bottom-up emission inventories: A satellite-based analysis

Spatial proxies used in bottom-up emission inventories to derive the spatial distributions of emissions are usually empirical and involve additional levels of uncertainty. Although uncertainties in current emission inventories have been discussed extensively, uncertainties resulting from improper spatial proxies have rarely been evaluated. In this work, we investigate the impact of spatial proxies on the representation of gridded emissions by comparing six gridded NOx emission datasets over China developed from the same magnitude of emissions and different spatial proxies. GEOS-Chem-modeled tropospheric NO2 vertical columns simulated from different gridded emission inventories are compared with satellite-based columns. The results show that differences between modeled and satellite-based NO2 vertical columns are sensitive to the spatial proxies used in the gridded emission inventories. The total population density is less suitable for allocating NOx emissions than nighttime light data because population density tends to allocate more emissions to rural areas. Determining the exact locations of large emission sources could significantly strengthen the correlation between modeled and observed NO2 vertical columns. Using vehicle population and an updated road network for the on-road transport sector could substantially enhance urban emissions and improve the model performance. When further applying industrial gross domestic product (IGDP) values for the industrial sector, modeled NO2 vertical columns could better capture pollution hotspots in urban areas and exhibit the best performance of the six cases compared to satellite-based NO2 vertical columns (slope = 1.01 and R2 = 0.85). This analysis provides a framework for information from satellite observations to inform bottom-up inventory development. In the future, more effort should be devoted to the representation of spatial proxies to improve spatial patterns in bottom-up emission inventories.


Introduction
Emission inventories are essential for predicting spatial and temporal variations in air pollutants and for helping policy makers develop pollution control strategies. The traditional way of developing an emission inventory is the bottom-up approach, whereby activity rates and emission factors are aggregated for all known sources (e.g., Streets et al., 2003; Zhang et al., 2009). Emission inventories are most commonly estimated as emission totals of municipal districts (e.g., counties, provinces, or countries) because activity data from statistical yearbooks are typically available for such districts (e.g., Woo et al., 2003; Ohara et al., 2007; Kurokawa et al., 2013). However, gridded emissions are needed to apply inventories in chemical transport models.
G. Geng et al.: Impact of spatial proxies on bottom-up emissions
Published by Copernicus Publications on behalf of the European Geosciences Union.
Many methods for allocating regional emission totals to grids are available. The most accurate approach is to place emissions at the actual latitude-longitude coordinates of the emitting facilities (e.g., power plants or cement plants) when these coordinates are available. For mobile and area sources (i.e., sources for which the exact emission locations are unknown), so-called spatial proxies must be used to represent the spatial distributions of emissions. For example, emissions from road transportation sources can be allocated based on road networks, and residential emissions that are strongly related to human activities can be gridded using population densities or nighttime lights (Streets et al., 2003; Woo et al., 2003; Ohara et al., 2007; Zhang et al., 2009; Oda et al., 2011). In many cases, industrial emissions are also allocated by proxies because of the limited information available.
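The proxy-based allocation described above amounts to splitting a regional total in proportion to each grid cell's proxy value. A minimal sketch in Python, assuming a single region, a hypothetical helper name `allocate_by_proxy`, and illustrative proxy values that are not taken from any dataset used in this work:

```python
def allocate_by_proxy(regional_total, proxy_values):
    """Split a regional emission total across grid cells in proportion to a proxy."""
    proxy_sum = sum(proxy_values)
    if proxy_sum <= 0:
        raise ValueError("the proxy must have a positive sum within the region")
    return [regional_total * p / proxy_sum for p in proxy_values]


# Hypothetical region: 100 units of NOx and four grid cells.
# The proxy could be population, nighttime light intensity, or road length.
population = [10.0, 30.0, 40.0, 20.0]
gridded = allocate_by_proxy(100.0, population)
print(gridded)
```

Because the weights are normalized within the region, the regional total is conserved whatever proxy is chosen; only the spatial pattern changes.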
The selection of such spatial proxies is empirical, and their representations of real-world spatial emission patterns are of considerable concern (Zhou and Gurney, 2011; Andres et al., 2012). Recent efforts have been made to reveal the uncertainties of spatial proxies used in bottom-up CO2 inventories (Rayner et al., 2010; Zhou and Gurney, 2011; Andres et al., 2012, 2016; Gately et al., 2013, 2015), and sophisticated methods of allocating emissions to high-resolution grids have been formulated (Gurney et al., 2009; Rayner et al., 2010; Nassar et al., 2013; Asefi-Najafabady et al., 2014). Using population density to downscale fossil fuel emissions can induce biases when analyses are conducted at sub-national scales (Rayner et al., 2010). A high-resolution fossil fuel CO2 emission inventory for the United States further confirms that source heterogeneities are significant and vary by region and sector, indicating that population density is a biased spatial proxy below the state level (Zhou and Gurney, 2011). The studies above suggest that population density may not be an appropriate spatial proxy because it carries the hidden assumption that per capita emissions are homogeneous within a region. However, population density remains one of the most widely used spatial proxies in global and regional emission inventories (Zhang et al., 2009; Lu et al., 2011; Kurokawa et al., 2013), and uncertainties transmitted from improper spatial proxies to chemical transport models have rarely been evaluated.
A recent remarkable development in satellite-based remote sensing instruments, or the so-called top-down approach, provides additional constraints to evaluate and improve the existing understanding of emission inventories (Martin, 2008; Streets et al., 2013). Tropospheric column densities of important trace gases, such as NO2, SO2, CO, and HCHO, derived from satellite instruments generate an abundance of useful information on the emission sources of these gases (e.g., Duncan et al., 2010; Boeke et al., 2011; Lin, 2012; Pechony et al., 2013; Stavrakou et al., 2015; Wang et al., 2015; Liu et al., 2016), despite the biases intrinsic to satellite retrievals (Boersma et al., 2008; Lin et al., 2014). Many studies have compared model-simulated column densities with satellite-derived columns to validate the accuracy of bottom-up emissions (e.g., van Noije et al., 2006; Uno et al., 2007; Kim et al., 2009; Sheel et al., 2010; Lin et al., 2010; Itahashi et al., 2014; Han et al., 2015) and have attributed discrepancies between modeled and satellite-based column densities to errors in the magnitudes and/or spatial distributions of the emission inventories used in their models. Inverse modeling techniques can further derive "top-down" emission inventories with optimized magnitudes and emission spatial distributions (e.g., Martin et al., 2003, 2006; Jaeglé et al., 2005; Wang et al., 2007; Lin, 2012; Tang et al., 2013; Stavrakou et al., 2015). For example, Lin (2012) found that the widely used Intercontinental Chemical Transport Experiment Phase B (INTEX-B) inventory may underestimate NOx emissions in polluted urban areas or near large point sources. Lamsal et al. (2013) demonstrated that urban NO2 pollution is a power law scaling function of population size. The exponent values vary by region, reflecting regional differences in industrial development and per capita emissions. Although these top-down studies have identified uncertainties in the spatial representation of current NOx inventories and have provided correction factors, these factors are difficult to incorporate into bottom-up inventories or apply to emissions of other species because less attention has been paid to gridding processes in bottom-up approaches, which vary by sector and are shared across different species.
In this work, we use NOx as a case to study the influence of spatial proxies on spatial distributions of bottom-up emission inventories because NO2 satellite retrieval is less uncertain and the spatial distributions of tropospheric NO2 vertical columns are similar to those of surface NOx emissions, especially in the summer when the lifetime of NOx is short (Richter et al., 2005; Beirle et al., 2003; Lin, 2012). Based on the same magnitude of NOx emissions, we develop six sets of gridded emission data using different spatial proxies. We then use these gridded emissions and the nested-grid GEOS-Chem model to simulate tropospheric NO2 vertical columns and compare them with satellite-based observations. The effects of spatial proxies on the modeled NO2 vertical columns and representations of different spatial proxies are evaluated and discussed.
2 Methods and data

Review of spatial proxies
We first review the sector-specific spatial proxies utilized in several widely used regional NOx emission inventories covering China, including TRACE-P (Streets et al., 2003), INTEX-B (Zhang et al., 2009), REAS (Ohara et al., 2007), and REAS version 2 (Kurokawa et al., 2013), as shown in Table 1. In bottom-up inventories, spatial proxies are usually sector dependent and shared across different species. In general, all of these inventories rely on similar approaches to allocate emissions from combustion sources. Emissions from large-capacity power generation units are typically allocated according to their latitude-longitude information, which is the most accurate way to distribute emissions, whereas the population density is used to distribute emissions from small power plants whose locations are unknown (Streets et al., 2003; Ohara et al., 2007; Zhang et al., 2009). Recently, power plant locations from the Carbon Monitoring for Action (CARMA) database (Wheeler and Ummel, 2008) have been used to locate emissions (Kurokawa et al., 2013), providing a larger dataset of power plants than previous work. However, the accuracy of the CARMA database is still uncertain (Oda et al., 2011).

Table 1. Review of the spatial proxies used in regional bottom-up NOx emission inventories covering China. [Columns: Inventories, Sectors, Spatial proxies, Data sources. Only the first row survives: TRACE-P (Streets et al., 2003); large power plants; location; RAINS-Asia (Shah et al., 2000) and GEIA inventory (Graedel et al., 1993). A "Small power plants" row follows, but its entries are missing.]
For industrial and residential combustion, population density is the most frequently used spatial proxy because such emissions are believed to be highly correlated with human activities. Total population density is applied for industrial combustion in the TRACE-P and REAS emission inventories (Streets et al., 2003; Ohara et al., 2007; Kurokawa et al., 2013), which may allocate a large fraction of industrial emissions to rural areas because, in China, the rural population exceeds the urban population (China Statistical Yearbook; National Bureau of Statistics, 2007). In the INTEX-B inventory (Zhang et al., 2009), the urban population is used rather than the total population because industrial activities occur more often in urban areas; however, this strategy is still based on an assumption that per capita industrial emissions are the same across regions within a country. Per capita emissions among regions can differ widely in terms of the regions' levels of economic development and industrial activity or overall standards of living. Lamsal et al. (2013) found that in different urban areas worldwide, per capita emissions varied significantly because of differing energy consumption rates and energy production infrastructure.
For the transportation sector, road networks are widely used to spatially distribute on-road vehicle emissions based on the assumption that traffic volumes remain the same on different types of roads (e.g., Streets et al., 2003; Ohara et al., 2007). However, this assumption does not hold because vehicle populations and road capacities vary. Commonly used road networks in the inventories above are extracted from the Digital Chart of the World (DCW) (DMA, 1993), which has not been updated since 1992. In this case, using DCW road networks to allocate on-road emissions in China may create significant biases in the spatial distributions of emissions because road construction has occurred continuously over the past two decades.
Based on our review of the spatial proxies used in bottom-up emission inventories for China, it can be concluded that such spatial proxies are empirical and may have introduced considerable uncertainties into the spatial distributions of emissions. Studies on fossil fuel CO2 inventories have made tremendous efforts to improve the spatial distribution of emissions based on satellite-derived nighttime light (Oda et al., 2011), fuel sales and traffic data (McDonald et al., 2014), multivariate regressions (Wang et al., 2013; Gately et al., 2015), or combined fossil fuel data assimilation systems (Rayner et al., 2010; Asefi-Najafabady et al., 2014), shedding new light on ways to improve the spatial distribution of air pollutant inventories.

Gridded NOx emission inventory
The bottom-up NOx emission inventory evaluated in this work is obtained from the Multi-resolution Emission Inventory for China (MEIC, http://www.meicmodel.org) for 2006. The MEIC inventory is developed using a technology-based methodology that estimates anthropogenic emissions in China from ∼700 emitting sources (Zhang et al., 2007, 2009; Lei et al., 2011). Table 2 presents the 2006 anthropogenic NOx emissions estimated by the MEIC for China.
We then develop six sets of gridded NOx emission data using the same magnitude of emissions from the MEIC and different spatial proxies, as presented in Table 3. Many geographic information system (GIS) grid-based spatial proxies (e.g., population density and road networks) are at a resolution of 1 km × 1 km. In this work, emissions are first gridded at a resolution of 1 km × 1 km and then regridded to a resolution of 0.667° long × 0.5° lat to fit the GEOS-Chem model. The first two emission datasets evaluate two types of common spatial proxy maps: population distributions and nighttime lights. In the first gridded emission dataset (S1), all emissions are allocated based on population densities ([...] 2006). The Gridded Population of the World (GPW) population map is also frequently used to allocate emissions, but we do not include it in our analysis because the uncertainties introduced by differences in population maps are minor (Andres et al., 2016). In the second dataset (S2), all emissions are allocated based on the nighttime lights map drawn from the Defense Meteorological Satellite Program Operational Linescan System (DMSP-OLS) satellite (http://www.ngdc.noaa.gov/dmsp/download_rad_cal_96-97.html). These two scenarios, which involve a single type of spatial proxy for all sectors, are simplifications of the common practice utilized in current emission inventories. We use S1 as the base case and make modifications upon it to evaluate different spatial proxies used in different sectors. Figure 1 compares the spatial distributions of the total population and nighttime light data. The nighttime light data present more significant urban-rural gradients than the total population density data and, thus, may better represent differences in economic development levels between urban and rural areas.
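The regridding step from 1 km cells to the 0.667° long × 0.5° lat model grid can be pictured as a mass-conserving sum of fine cells into coarse cells. The sketch below is illustrative only: the function name `regrid_sum`, the grid origin at (0°, 0°), and the sample values are assumptions, and the actual GEOS-Chem nested grid may be aligned differently.

```python
import math


def regrid_sum(fine_cells, dlon=0.667, dlat=0.5):
    """Aggregate fine-grid emissions onto a coarse grid by summation.

    fine_cells: iterable of (lon, lat, emission), one entry per fine cell centre.
    Returns {(i, j): emission}, where (i, j) counts coarse cells from an
    assumed origin at (0 deg, 0 deg).
    """
    coarse = {}
    for lon, lat, emission in fine_cells:
        key = (math.floor(lon / dlon), math.floor(lat / dlat))
        coarse[key] = coarse.get(key, 0.0) + emission
    return coarse


# Hypothetical fine cells: two fall in the same coarse cell, one elsewhere.
fine = [(116.1, 39.6, 2.0), (116.3, 39.9, 3.0), (121.5, 31.2, 5.0)]
coarse = regrid_sum(fine)
print(coarse)
```

Summation rather than averaging is used because emissions are extensive quantities, so the total over the domain is unchanged by the regridding.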
Our previous study revealed that the locations of large point sources in emission inventories significantly affect the prediction accuracy of the chemical transport models (Wang et al., 2012). A third dataset (S3) is used to investigate the effects of using the exact locations of emissions from large point sources. S3 is based on S1 but uses a unit-based power plant emission dataset, which includes the locations of ∼6400 power generation units across China (Liu et al., 2015), to override power sector emissions.
The fourth dataset (S4) is based on S3 but uses DCW road networks, which have been applied in several widely used regional emission inventories for China, to allocate on-road transportation emissions. In S4, emissions are allocated according to road length, neglecting the distinctions of road classes and capacities. An improved approach of allocating on-road transport emissions is implemented in the fifth dataset (S5). We use the county-level vehicle population as a spatial proxy to distribute provincial emissions of on-road transportation to each county (an administrative unit one level lower than city). The county-level vehicle population is simulated by the Gompertz function ($V = V^{*} \times e^{\alpha e^{\beta E}}$), where $V$ and $V^{*}$ represent actual and saturation levels of to- [...] the current road conditions (Fig. 2c). Comparisons between county-level vehicle populations, the total population, and these two types of road networks are shown in Fig. 2d-f. Densely populated regions typically have larger vehicle populations and therefore greater road network demands. However, the DCW road network fails to provide detailed accounts of the roads in urban areas, and as a result, the emissions allocated to these regions are underestimated. The updated CDRM road networks can help resolve this shortcoming.
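As a rough illustration of the S5 approach, the sketch below evaluates a Gompertz curve and uses the resulting vehicle populations to split a provincial total among counties. Several things here are assumptions rather than the authors' method: E is treated as a county-level economic indicator and V as vehicles per 1000 people (the surviving text defines neither), the parameter values are hypothetical, and both function names are invented for this example.

```python
import math


def gompertz_vehicles(e, v_sat, alpha, beta):
    """Gompertz curve V = V* x exp(alpha x exp(beta x E)).

    With alpha < 0 and beta < 0, V rises monotonically towards v_sat as E grows.
    """
    return v_sat * math.exp(alpha * math.exp(beta * e))


def allocate_to_counties(provincial_emission, county_vehicles):
    """Split a provincial on-road total in proportion to county vehicle populations."""
    total = sum(county_vehicles.values())
    return {c: provincial_emission * v / total for c, v in county_vehicles.items()}


# Hypothetical parameters and inputs (not values from this work).
V_SAT, ALPHA, BETA = 500.0, -5.0, -0.2
economic_indicator = {"county_a": 5.0, "county_b": 20.0}
population_thousands = {"county_a": 400.0, "county_b": 600.0}

vehicles = {
    c: population_thousands[c] * gompertz_vehicles(economic_indicator[c], V_SAT, ALPHA, BETA)
    for c in economic_indicator
}
county_emissions = allocate_to_counties(100.0, vehicles)
print(county_emissions)
```

The county with the higher indicator receives a larger share than its population alone would give it, which is the behaviour that distinguishes this proxy from a purely population-based one.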
In the last dataset (S6), further modifications of S5 are made to better represent the spatial patterns of the industrial sector. We use the industrial gross domestic product (IGDP) as a first-step spatial proxy instead of population density. Figure 3 shows the correlations between the normalized industrial emissions and three factors (total population, urban population, and IGDP) at the provincial level. IGDP is more closely correlated with emissions (R2 = 0.72) than other factors, indicating that IGDP can better represent the spatial patterns of industrial activities than population density. Using population density as a spatial proxy to allocate industrial emissions assumes that per capita industrial emissions are the same across regions, which may result in underestimations for industrialized regions and overestimations for rural regions. In S6, provincial emissions are first distributed into counties according to the IGDP of each county, and then county-level emissions are allocated to grids based on the population densities drawn from LandScan.
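The two-step allocation used for the industrial sector in S6 (province to counties by IGDP, then counties to grid cells by population) can be sketched as follows. The function names, the county and grid identifiers, and all numbers are hypothetical, and grid identifiers are assumed to be unique across counties.

```python
def share(total, weights):
    """Split a total among keys in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {key: total * w / weight_sum for key, w in weights.items()}


def two_step_allocation(provincial_total, county_igdp, grid_population):
    """Step 1: province -> counties by IGDP. Step 2: county -> grid cells by population.

    grid_population maps each county to {grid_id: population}.
    """
    county_totals = share(provincial_total, county_igdp)
    gridded = {}
    for county, county_total in county_totals.items():
        gridded.update(share(county_total, grid_population[county]))
    return gridded


# Hypothetical province with two counties and two grid cells each.
igdp = {"A": 75.0, "B": 25.0}
population = {"A": {"a1": 1.0, "a2": 3.0}, "B": {"b1": 2.0, "b2": 2.0}}
industrial = two_step_allocation(100.0, igdp, population)
print(industrial)
```

Population only redistributes emissions inside a county here, so a county's total is fixed by its IGDP share regardless of how many people live there.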

GEOS-Chem model
The GEOS-Chem model is a global three-dimensional (3-D) model of atmospheric chemistry that includes > 80 species and > 300 reactions (Bey et al., 2001; Park et al., 2004). [...] (http://gmao.gsfc.nasa.gov/). Mixing in the planetary boundary layer follows a non-local scheme (Lin and McElroy, 2010) that improves upon previous assumptions of a fully mixed boundary layer. Convection occurs according to a modified relaxed Arakawa-Schubert scheme (Rienecker et al., 2008).
In this study, we use GEOS-Chem version 09-01-02 to simulate tropospheric NO2 vertical columns over China for 2006. The EDGAR v3 emission inventory (Olivier and Berdowski, 2001) is used for global anthropogenic emissions, and the East-Southeast Asia region is replaced with the INTEX-B inventory (Zhang et al., 2009). We further override the anthropogenic NOx emission inventory for China using the six datasets described in Sect. 2.2. To remove the effects of the initial concentration fields, a 1-year spin-up is conducted. We use averaged summer (June, July, and August (JJA)) NO2 columns for this evaluation because the short lifetime of NO2 in the summer favors NO2 column linkage with local NOx emissions. We average daily 2 h modeled tropospheric NO2 vertical columns at a local time of 13:00-15:00 and sample the model at grids coincident with the daily satellite pixels used in the final average columns.
Grids in the nested-grid GEOS-Chem model are 0.667° long × 0.5° lat, and their areas range from 2500 to 4000 km2 over eastern China, comparable to the mean size of a county (∼3000 km2) in this region. Thus, the spatial pattern of emissions evaluated in this work can represent the spatial variations of emissions at the county level. To draw comparisons with other county-level indicators, gridded NO2 column densities simulated using this model are resampled to county averages by area weights. In this work, a total of 2364 county-level districts are covered, including both counties and municipal districts across China.
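Resampling gridded columns to county averages by area weights can be written as a weighted mean over the grid cells that intersect a county. In this sketch the overlap areas are assumed to be precomputed (for example, by a GIS intersection), and the function name and sample numbers are hypothetical.

```python
def county_average(grid_columns, overlap_areas):
    """Area-weighted mean of gridded columns over one county.

    grid_columns: {grid_id: NO2 column in that grid cell}
    overlap_areas: {grid_id: area of the grid cell lying inside the county, km2}
    """
    total_area = sum(overlap_areas.values())
    if total_area <= 0:
        raise ValueError("the county must overlap at least one grid cell")
    weighted = sum(grid_columns[g] * area for g, area in overlap_areas.items())
    return weighted / total_area


# Hypothetical county covering 3000 km2 of one grid cell and 1000 km2 of another.
columns = {"g1": 4.0, "g2": 8.0}   # in units of 10^15 molecules cm^-2
areas = {"g1": 3000.0, "g2": 1000.0}
print(county_average(columns, areas))  # 5.0
```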

Satellite data
The satellite data used in this work come from the Ozone Monitoring Instrument (OMI) aboard the Aura satellite (Levelt et al., 2006). NO2 slant column densities are derived using a differential optical absorption spectroscopy (DOAS) algorithm (Platt, 1994; Boersma et al., 2002; Bucsela et al., 2006). The tropospheric slant NO2 column densities used in this work are drawn from the Dutch OMI NO2 (DOMINO) product (version 2, collection 3) (Boersma et al., 2011) available from the Tropospheric Emission Monitoring Internet Service (TEMIS) (http://www.temis.nl/). The air mass factor (AMF) is a multiplicative factor used to convert slant columns into vertical columns (Palmer et al., 2001). The retrieved tropospheric vertical NO2 column is sensitive to the NO2 vertical profile used during the AMF calculation. Following Lamsal et al. (2010), we revise the AMF by replacing the original NO2 profile with that generated from the nested-grid GEOS-Chem model using S1-S6 emissions described above. The new NO2 vertical profiles have a finer spatial resolution of 0.667° long × 0.5° lat compared to the original resolution of 3° long × 2° lat. Six scenario-specific OMI NO2 products are generated for comparison with corresponding model results, and higher resolution prior profiles could reduce the bias by 3-6 % between modeled and satellite data for different scenarios. In this work, we restrict the use of OMI pixels to those at a solar zenith angle of ≤ 70° and a cloud fraction of ≤ 0.3 in the final averaged columns. Pixels at swath edges (five pixels on each side) are rejected to reduce spatial averaging. Finally, each OMI pixel is allocated to 0.667° long × 0.5° lat grids by area weights with corner coordinate information to create daily tropospheric vertical NO2 column maps. The retrieved NO2 columns are also resampled from pixels to county averages to draw comparisons with indicators at the county level.
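The pixel screening and the slant-to-vertical conversion described above can be sketched as follows. The thresholds are those stated in the text; the `OmiPixel` record, its field names, the assumption of 60 cross-track positions per swath, and the sample values are illustrative and do not reflect the DOMINO file format.

```python
from dataclasses import dataclass


@dataclass
class OmiPixel:
    slant_column: float        # tropospheric slant column, molecules cm^-2
    amf: float                 # tropospheric air mass factor (dimensionless)
    solar_zenith_angle: float  # degrees
    cloud_fraction: float      # 0 to 1
    row: int                   # 0-based cross-track position


def passes_screening(pixel, n_rows=60, edge_rows=5):
    """Keep pixels with SZA <= 70 deg, cloud fraction <= 0.3, away from swath edges."""
    if pixel.solar_zenith_angle > 70.0 or pixel.cloud_fraction > 0.3:
        return False
    return edge_rows <= pixel.row < n_rows - edge_rows


def vertical_column(pixel):
    """Vertical column = slant column / AMF."""
    return pixel.slant_column / pixel.amf


# Hypothetical pixels: only the first one passes all three checks.
pixels = [
    OmiPixel(9.0e15, 1.5, 35.0, 0.1, 30),
    OmiPixel(9.0e15, 1.5, 75.0, 0.1, 30),  # solar zenith angle too large
    OmiPixel(9.0e15, 1.5, 35.0, 0.5, 30),  # too cloudy
    OmiPixel(9.0e15, 1.5, 35.0, 0.1, 2),   # swath edge
]
vertical_columns = [vertical_column(p) for p in pixels if passes_screening(p)]
print(vertical_columns)
```

Because the vertical column scales inversely with the AMF, replacing the a priori NO2 profile changes the retrieved column even though the slant column is untouched, which is why a scenario-specific product is generated for each of S1-S6.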

Results
The spatial distributions of tropospheric NO2 vertical columns over China in the summer simulated from six gridded inventories are presented and compared with scenario-specific satellite-observed NO2 vertical columns in Fig. 4. Because all the inventories in model simulations have the same emission totals, differences among S1-S6 reflect differences in the spatial allocations of the total emissions. In general, modeled and observed NO2 vertical columns exhibit similar patterns but different fine structures. All modeled cases can reproduce highly polluted areas over the North China Plain and Yangtze River Delta, whereas many pollution hotspots in these regions are underestimated compared to the satellite data in S1. As discussed above, using the total population distribution as a spatial proxy may have misrepresented the urban-rural gradients of economic development levels and allocated disproportionately large fractions of emissions to rural regions. Consequently, urban emission hotspots may have been underestimated because in China the rural population exceeds the urban population.
Figure 5 compares modeled and satellite-retrieved tropospheric NO2 vertical columns by county in China for the six analyzed cases. The first column in Fig. 5 compares the model and satellite data for all districts and counties in China for the summer of 2006. Modeled NO2 columns are generally in good agreement with OMI NO2 columns, with regression slopes varying from 0.78 to 1.01 and R2 values varying from 0.75 to 0.86. Simulations obtained using S1 substantially underestimate NO2 columns compared to satellite-based columns, especially in densely populated regions. Tropospheric NO2 vertical columns simulated using S2 present more pollution hotspots compared to S1, particularly in economically developed regions, such as the capital cities in each province. Using nighttime lights as a spatial proxy (S2) instead of the total population can improve the model's performance. In this case, the slope and R2 increase to 0.94 and 0.81, respectively, and the normalized mean bias (NMB) increases from −11.1 to 6.0 %. These results indicate that the nighttime light map may serve as a better indicator for NOx emissions than the total population because it can better represent a region's economic development level than the total population.
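The three statistics quoted above (regression slope, R2, and NMB) can be computed from paired county-level values as in the sketch below. The text does not state the regression method, so ordinary least squares with an intercept, with the satellite column as the independent variable, is an assumption; NMB follows its conventional definition, and the sample data are made up.

```python
def evaluate(model, satellite):
    """Return (slope, r_squared, nmb_percent) for paired model and satellite columns."""
    n = len(satellite)
    mean_x = sum(satellite) / n
    mean_y = sum(model) / n
    sxx = sum((x - mean_x) ** 2 for x in satellite)
    syy = sum((y - mean_y) ** 2 for y in model)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(satellite, model))
    slope = sxy / sxx                      # least-squares slope of model on satellite
    r_squared = sxy ** 2 / (sxx * syy)     # squared Pearson correlation
    nmb_percent = 100.0 * (sum(model) - sum(satellite)) / sum(satellite)
    return slope, r_squared, nmb_percent


# Hypothetical counties in which the model is uniformly 20 % low.
satellite_columns = [1.0, 2.0, 3.0, 4.0]
model_columns = [0.8, 1.6, 2.4, 3.2]
slope, r_squared, nmb = evaluate(model_columns, satellite_columns)
print(slope, r_squared, nmb)
```

A uniform low bias lowers the slope and the NMB but leaves R2 at 1, which illustrates why the three metrics are reported together.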
When using the exact positions of power plants instead of total population density for the power sector (S3 vs. S1), most hotspots in the simulated tropospheric NO2 vertical columns are enhanced, and the discrepancies between the modeled and observed columns are reduced. Model simulations based on S3 correlate better with satellite observations (slope = 0.87 and R2 = 0.83) than those based on S1, demonstrating the importance of determining the positions of large point sources. NO2 columns simulated using S4 have more bias than those generated using S3, mainly because of underestimations of on-road transportation emissions resulting from the use of outdated DCW road networks as a proxy, as discussed above. Model simulations based on S5 agree better with satellite-based NO2 columns (slope = 0.95 and R2 = 0.86) than those based on S3 and S4, which shows that using vehicle population and CDRM road networks can better represent transportation emissions. When using IGDP to constrain industrial emissions, modeled NO2 vertical columns further improve (S6 vs. S5). Results indicate that improvements in the transportation sector have more significant effects on modeled NO2 columns than those in the industrial sector. Finally, the simulations based on S6 exhibit the best performance of all six cases compared to satellite observations (slope = 1.01 and R2 = 0.85). To further understand the biases in the model simulations, we divide all counties into three categories according to administrative district definition (counties in four municipalities, i.e., Beijing, Tianjin, Shanghai, and Chongqing; urban counties; and suburban counties) and compare the modeled and observed NO2 columns for these three categories (Fig. 5). For the four Chinese municipalities studied, emission totals are allocated from cities to counties; however, for other provinces, emissions are allocated from province to counties. Municipalities are defined as a separate category in the following analysis.
Model underestimation in S1 mainly occurs for urban counties (Fig. 5c, slope = 0.53 and NMB = −20.3 %). This approach, which uses the total population as a spatial proxy, assumes that the per capita emissions in different regions in a province are the same. However, because of varied industrial levels and economic patterns, per capita emissions can be very different across regions. According to Lamsal et al. (2013), the tropospheric NO2 column is a power law scaling function of the population size, and the exponent is affected by regional differences in per capita emissions. Using population density as a spatial proxy in these areas significantly underestimates the emissions in urban areas, as shown in Fig. 5c. Using nighttime lights to allocate emissions can significantly improve the model's performance for urban areas (Fig. 5g, slope = 1.02 and NMB = −1.8 %), although overestimations are identified in a few counties. This finding demonstrates the feasibility of using nighttime lights alone as a spatial proxy when more complex indicators are not available. After determining the exact positions of power plant emissions, the model simulations are substantially improved for all types of regions (Fig. 5j-l); however, the urban emissions are still underestimated (slope = 0.69 and NMB = −13.2 %), possibly because of underestimations of the industrial and transportation emissions in urban regions when the population density is applied as a spatial proxy. For S4, the model performances are slightly worse than those for S3 for urban regions. In S4, the vehicle populations of different counties are assumed to be linearly correlated with the outdated DCW road networks, and as a result, the on-road transportation emissions in urban areas may be substantially underestimated. Using vehicle population and the updated road networks can significantly alleviate this issue (S5), and the NMB decreases to −4.6 %. Finally, for S6, the model performances are further improved for urban regions. Thus, using IGDP and the updated road network as spatial proxies can better represent the emission sources for urban areas.
For the four municipalities studied, total populations and nighttime lights alone cannot effectively represent emission patterns because these proxies are concentrated in city centers, whereas large emission sources, such as power plants and industrial activities, have largely been relocated away from urban regions in municipalities. When the locations of power emissions are considered, the correlation between the modeled and observed NO2 columns is significantly improved (Fig. 5j, n, and r). However, using the IGDP and updated road network as spatial proxies disproportionately concentrates emissions in urban areas, resulting in an overestimation of modeled NO2 columns.
Figure 6 presents the distributions of ratios between simulated and satellite-based county-level NO2 column densities for the six cases. We remove those counties with OMI NO2 columns of less than 3 × 10^15 molecules cm^-2 to avoid the influence of the background areas with more uncertain retrieved columns. After rejecting the background regions, ∼770 counties (33 %) covering much of eastern China remain. Model simulations based on S6 exhibit the best performance. Differences between the modeled and satellite-based NO2 columns are within 20 % for 391 counties in S6 compared to 310 and 331 counties in S1 and S2, respectively. Model simulations of S1 present large negative biases compared to satellite-based NO2 columns, with 119 counties underestimated by over 50 % (66 counties for S6). However, for S2, positive biases between the modeled and satellite-based NO2 columns exceed 50 % for 28 counties (10 counties for S5), indicating that using nighttime light maps may overestimate urban emissions in certain regions.
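The county counts reported above follow from a simple classification of model-to-satellite ratios after the background cutoff. A minimal sketch, assuming paired county-level columns and a hypothetical function name; reading "underestimated by over 50 %" as a ratio below 0.5 and a positive bias exceeding 50 % as a ratio above 1.5 is an interpretation, and the sample values are invented:

```python
def summarize_ratios(model, satellite, background=3.0e15):
    """Count counties by model/satellite ratio after removing background counties."""
    ratios = [m / s for m, s in zip(model, satellite) if s >= background]
    return {
        "kept": len(ratios),
        "within_20_percent": sum(1 for r in ratios if abs(r - 1.0) <= 0.2),
        "under_by_over_50_percent": sum(1 for r in ratios if r < 0.5),
        "over_by_over_50_percent": sum(1 for r in ratios if r > 1.5),
    }


# Hypothetical counties (molecules cm^-2); the last one is below the cutoff.
omi = [5.0e15, 8.0e15, 4.0e15, 6.0e15, 1.0e15]
sim = [4.5e15, 3.2e15, 7.0e15, 6.6e15, 0.2e15]
summary = summarize_ratios(sim, omi)
print(summary)
```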

Uncertainties
This work is subject to several uncertainties. Biases between model simulations and satellite data come not only from emission inventories but also from the model itself or satellite retrievals. Potential errors in nested-grid GEOS-Chem model simulations are compounded by errors in GEOS-5 meteorological fields, planetary boundary layer heights, and a variety of chemical parameters selected in a given model. Model simulation errors for eastern China are estimated to present a negative systematic bias of 10-20 % (season dependent) plus a random error of 30 % according to previous work (Martin et al., 2003; Lin and McElroy, 2011; Lin et al., 2012). Sensitivity simulations (Lin et al., 2012) of the model factors above show that none of them can fully explain the bias between model simulations and satellite observations. Combining all these modifications can achieve better agreement with satellite observations but cannot eliminate the negative biases associated with extremely polluted locations (Lin et al., 2012), suggesting that model errors are not the primary cause of model-satellite biases for urban areas.
The stated uncertainties of individual DOMINO v2.0 NO2 column retrievals are estimated at 1.0 × 10^15 molecules cm^-2 + 25 % (Boersma et al., 2011), which demonstrates the dominance of errors arising during the calculation of AMF in polluted areas (Boersma et al., 2007). In particular, DOMINO v2.0 OMI products do not explicitly account for the effects of aerosols on solar radiation, which are important for the calculation of AMF and particularly significant for eastern China because of its high aerosol loadings. In Lin et al. (2014), explicitly including aerosol scattering and absorption exerts either positive or negative effects on retrieved NO2, with a mean effect of 14 %. However, aerosol effects cannot fully explain the large discrepancies observed between model and satellite results for urban areas.
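To see why the relative term dominates in polluted areas, the stated single-retrieval uncertainty can be evaluated for a clean and a polluted column. The function name and the two sample columns are hypothetical; the formula is the one quoted above.

```python
def retrieval_uncertainty(column, absolute=1.0e15, relative=0.25):
    """Stated single-pixel uncertainty: 1.0e15 molecules cm^-2 plus 25 % of the column."""
    return absolute + relative * column


for column in (2.0e15, 2.0e16):  # hypothetical background and polluted columns
    sigma = retrieval_uncertainty(column)
    print(column, sigma, sigma / column)
```

For the smaller column the uncertainty is 75 % of the value and is dominated by the absolute term; for the larger column it falls to 30 % and is dominated by the AMF-related relative term.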
Biases in the satellite NO2 data could affect the comparisons for different scenarios. Previous studies indicated that the OMI NO2 columns could be biased high by 20 % over China (Irie et al., 2008; Lin et al., 2014). In this case, the best scenario of spatial proxies would be S4 instead of S6. If the OMI data are biased low instead of biased high, the best scenario will remain the same but with less agreement compared to satellite observations. However, given the high sensitivity of modeled NO2 columns to spatial proxies, we can still conclude that the spatial proxies used in gridded emission inventories could significantly affect the representation of bottom-up emission inventories.
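The reasoning behind the change of best scenario can be illustrated with synthetic data. The sketch below assumes the comparison is an ordinary least-squares fit with satellite columns on the x axis and modeled columns on the y axis, and that a 20 % high bias is removed by dividing the satellite columns by 1.2; both are assumptions, and all values are generated for illustration rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic county columns in units of 1e15 molecules cm^-2 (illustrative only).
omi = rng.uniform(3.0, 20.0, size=200)
model = 1.01 * omi + rng.normal(0.0, 1.0, size=200)

def ols_slope(x, y):
    """Slope of the least-squares line y = slope * x + intercept."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

slope_raw = ols_slope(omi, model)              # model vs. retrieved columns
slope_corrected = ols_slope(omi / 1.2, model)  # model vs. bias-corrected columns
print(f"slope vs. raw OMI: {slope_raw:.2f}")
print(f"slope vs. corrected OMI: {slope_corrected:.2f}")
```

Rescaling the satellite columns by a constant factor multiplies the fitted slope by that factor and leaves the correlation unchanged, so under these assumptions a case with a slope near 1 would move above 1, and a case with a lower slope would move closer to 1.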
Other factors, such as the resolution, may also introduce uncertainty. The spatial proxies in S6 are quite good at the resolution of 0.667° long × 0.5° lat used in this work, which roughly corresponds to the county level in eastern China. However, they may not be suitable at other resolutions. Further work based on models with finer grids should be performed to explore appropriate spatial proxies at finer resolutions. Natural emissions from lightning and soil sources are not discussed in this work, although they are suggested to be underestimated by approximately 16 % for China for 2006; they account for less than 3 and 6 % of anthropogenic emissions, respectively, and even less in highly polluted regions (Lin, 2012).

Discussion
In this work, we use NOx emissions to relate the biases between model simulations and observations to local emissions and evaluate the impacts of spatial proxies on the distributions of bottom-up emission inventories at the county level using satellite constraints. Insight obtained from this work can be applied to other species generated from fossil fuel combustion (e.g., SO2 and CO2) because they typically come from the same sources. Our method represents a feasible approach to studying species that are difficult to validate directly because suitable observation data are not available and/or lifetimes are long.
As shown in Sect. 2.1 and described in this work, regardless of how spatial proxies are adjusted in a bottom-up inventory, they are always empirical and contribute uncertainties to the spatial representation of emission inventories. A companion paper of this work found that large uncertainties existed in proxy-based inventories on the urban scale (less than 0.25°) by comparing the proxy-based inventory with an inventory developed from exact locations of emitting facilities in Hebei, China (Zheng et al., 2017). Critical evaluations must be conducted to ensure the accuracy of these proxies. Our work presents a practical means to diagnose this problem and involves using satellite observations as an indicator of ground emissions to determine the relationships between emissions and local parameters. Our approach also has the attribute of propagating information from satellite observations into bottom-up inventory developments.
The approach presented here can also be expanded to other regions because a universal relationship exists between emissions and the economy. Spatial proxies that work well for China may not be suitable for other regions because of regional differences in energy consumption, industrial development, and living standards, as demonstrated in Lamsal et al. (2013). However, by integrating local satellite observation data, a better understanding of the spatial distribution of emissions and their relationships to local parameters can be obtained, which has implications for the local selection of spatial proxies.

Concluding remarks
The spatial proxies used when developing gridded emission inventories are empirical and can introduce uncertainties in bottom-up emission inventories. This issue has rarely been evaluated. In this work, we evaluate the effects of spatial proxies on the representations of spatial distributions of emissions using an integrated framework of bottom-up emission inventories, a chemical transport model, and satellite observations. We first develop six sets of gridded NOx emissions for China using the same magnitude of emissions from the MEIC and different spatial proxies. The spatial proxies considered in this study include the following: total population, nighttime lights, locations of power plants, IGDP, vehicle populations, and two different road network datasets. The nested-grid GEOS-Chem model is then used to simulate tropospheric NO2 vertical columns using the six gridded emissions, and modeled NO2 columns are compared to satellite-based NO2 columns derived from OMI data.
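The idea of gridding the same emission total with different proxies can be sketched as a proportional allocation, in which each grid cell receives a share of the regional total equal to its share of the proxy. This is a generic illustration under that assumption, not the MEIC gridding procedure; the function name, the 2 × 3 grid, and all values are hypothetical.

```python
import numpy as np

def allocate_by_proxy(total_emission, proxy):
    """Distribute a regional emission total over grid cells in proportion to a proxy."""
    proxy = np.asarray(proxy, dtype=float)
    proxy_sum = proxy.sum()
    if proxy_sum <= 0:
        raise ValueError("proxy must have a positive sum")
    return total_emission * proxy / proxy_sum

# Hypothetical 2 x 3 grid for one region; the top-right cell is an urban core.
population = np.array([[10.0, 30.0, 60.0],
                       [50.0, 20.0, 30.0]])
night_lights = np.array([[0.0, 5.0, 60.0],
                         [10.0, 0.0, 25.0]])

total = 100.0  # same regional total for both proxies (arbitrary units)
by_population = allocate_by_proxy(total, population)
by_lights = allocate_by_proxy(total, night_lights)
print(by_population)
print(by_lights)
```

Both grids sum to the same total, but in this toy case the nighttime-light proxy concentrates emissions in the urban cell while the population proxy spreads them over the surrounding cells, which is the kind of contrast the six scenarios are designed to test.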
We found that the spatial proxies used in gridded emission inventories significantly affect simulated NO2 columns. The model performance is largely dependent on the representations of urban emissions in the bottom-up inventory, which are very sensitive to spatial proxies. Using the total population density tends to allocate more emissions to rural areas and to underestimate NO2 columns compared to satellite observations. Nighttime lights represent urban emissions better than population density because they correlate more closely with economic development levels. When using sophisticated combinations of different proxies to represent urban emissions (i.e., positions of large point sources, IGDP, vehicle populations, and the most recent road network), modeled NO2 columns agree better with satellite observations, indicating that improving the spatial representation of emissions could significantly increase the accuracy of emission inventories.
The results of this work emphasize the importance of spatial proxies for bottom-up emission inventory development. Discrepancies between models and observations should be attributed not only to errors in the magnitude of total emission estimates but also to spatial proxies. Although the selection of spatial proxies in this work is still empirical and may not represent the best case, we illustrate methods for improving gridded emission inventories by carefully selecting spatial proxies. This study provides a framework to apply information from satellite observations to inform bottom-up inventory development. The approach used here could be further extended to other species and regions, and more advanced optimized approaches could be introduced into the development of emission inventories of different air pollutants (Asefi-Najafabady et al., 2014; Gately et al., 2013; Nassar et al., 2013). More efforts should be made to improve the spatial distributions of bottom-up emission inventories in the future.
Data availability. The emission inventory data used in this work are available at http://www.meicmodel.org (Tsinghua University, 2013). Research data are available upon request to the corresponding author Qiang Zhang (qiangzhang@tsinghua.edu.cn).
Competing interests. The authors declare that they have no conflict of interest.

Figure 1. Spatial patterns of (a) the total population density and (b) nighttime lights for eastern China and (c) the distributions of total population density and nighttime lights in China at a resolution of 0.1° × 0.1°.

Figure 2. Comparisons between the total population (left column), DCW road networks (middle column), and CDRM road networks (right column).(a-c) Maps of the total population and two road networks in the Beijing-Tianjin region as an example.(d-f) Comparison with county-level vehicle populations.

Figure 3. Correlations between normalized provincial industrial emissions and three types of spatial proxies: (a) total population, (b) urban population, and (c) IGDP data.
The nested-grid GEOS-Chem model developed by Chen et al. (2009) is used in this work. It has a horizontal resolution of 0.667° long × 0.5° lat with 47 vertical layers and a nested-grid domain that covers China and most of its neighboring countries (70-150° E, 11° S-55° N). The global model, which has a spatial resolution of 2.5° long × 2° lat, provides time-varying boundary conditions via the one-way nested approach. Both global and nested simulations are driven by the 3-D meteorological fields of GEOS-5 assimilated by the Goddard Earth Observing System (GEOS) at the NASA Global Modeling and Assimilation Office (GMAO;

Figure 4. Spatial distributions of summer averaged tropospheric NO2 vertical columns modeled by GEOS-Chem based on S1-S6 emissions and scenario-specific data retrieved from the OMI for 2006.

Figure 5. Comparisons between county-level model simulations from the six emission cases and OMI NO2 vertical columns for all counties (first column), urban districts in municipalities (second column), urban districts in other cities (third column), and other counties (fourth column). The color of each symbol corresponds to the population density in the county specified by that symbol. The dotted line has a slope of 1.

Figure 6. Distributions of the ratios between the county-level model simulations and satellite observations from the six emission inventories. The shaded region indicates a range of 0.9-1.1.

Table 2. Anthropogenic NOx emissions by sector in China for 2006 from the MEIC inventory.

Table 3. Spatial proxies used in the gridding process for the six emission scenarios developed in this study.
a TP: total population from the LandScan population database (ORNL, 2006).
b NL: nighttime light from the DMSP-OLS satellite (http://www.ngdc.noaa.gov/dmsp/download_rad_cal_96-97.html).
c IGDP: industrial GDP (China Statistical Yearbook, National Bureau of Statistics, 2007).
d DCW: road networks from the Digital Chart of the World (DMA, 1993).
e VP: vehicle population (Zheng et al., 2014).
f PS: coordinates of point sources (Liu et al., 2015).
g CDRM: road networks from the China Digital Road-network Map developed by the National Administration of Surveying, Mapping and Geoinformation of China.