Interactive comment on “Evaluating BC and NOx emission inventories for the Paris region from MEGAPOLI aircraft measurements”

Abstract. High uncertainties affect black carbon (BC) emissions, and, despite its important impact on air pollution and climate, very few BC emissions evaluations are found in the literature. This paper presents a novel approach, based on airborne measurements across the Paris, France, plume, developed in order to evaluate BC and NOx emissions at the scale of a whole agglomeration. The methodology consists in integrating, for each transect, across the plume observed and simulated concentrations above background. This allows for several error sources (e.g., representativeness, chemistry, plume lateral dispersion) to be minimized in the model used. The procedure is applied with the CHIMERE chemistry-transport model to three inventories – the EMEP inventory and the so-called TNO and TNO-MP inventories – over the month of July 2009. Various systematic uncertainty sources both in the model (e.g., boundary layer height, vertical mixing, deposition) and in observations (e.g., BC nature) are discussed and quantified, notably through sensitivity tests. Large uncertainty values are determined in our results, which limits the usefulness of the method to rather strongly erroneous emission inventories. A statistically significant (but moderate) overestimation is obtained for the TNO BC emissions and the EMEP and TNO-MP NOx emissions, as well as for the BC / NOx emission ratio in TNO-MP. The benefit of the airborne approach is discussed through a comparison with the BC / NOx ratio at a ground site in Paris, which additionally suggests a spatially heterogeneous error in BC emissions over the agglomeration.


Introduction
Knowledge of pollutant emissions is a key element in the field of air pollution. It provides essential information on the contribution of various source sectors to pollutant levels, which is required for targeting emission reduction measures. Emission inventories are a necessary input to chemistry-transport models (CTMs), which are important tools for atmospheric research and air quality management.
Among the various emitted species, black carbon (BC) aerosol is a chemical compound of major importance. In air quality, it contributes strongly to the health risk (Peng et al., 2009) related to fine particulate matter (PM2.5, particulate matter with aerodynamic diameter below 2.5 µm). It also plays a crucial role in the Earth's climate through the scattering and the absorption of incoming solar radiation and the subsequent change in planetary albedo (direct effect) (Schulz et al., 2006; Yu et al., 2006), and through the modification of cloud properties, as BC coated with hydrophilic species acts as cloud condensation nuclei (indirect effect) (Lohmann and Feichter, 2005). The overall industrial-era BC radiative forcing (including direct, semi-direct and indirect effects, as well as albedo change due to deposition on snow) is estimated at +1.1 W m−2, ranking second between the carbon dioxide (CO2, +1.56 W m−2) and methane (CH4, +0.86 W m−2) forcings (Bond et al., 2013).
Published by Copernicus Publications on behalf of the European Geosciences Union.
However, high uncertainties still affect BC emission inventories, making the true forcing uncertain. As a product of incomplete combustion processes, BC emissions at the global scale mainly originate from energy-related combustion (e.g., on- and off-road vehicles in transport areas, biofuel and coal in residential areas) and open burning (savannas and forest fires) (Bond et al., 2004, 2013; Junker and Liousse, 2008; Lamarque et al., 2010). Global BC emissions have most recently been estimated to be 7.5 Tg yr−1, with an uncertainty range of 2-29 Tg yr−1, of which 4.8 Tg yr−1 originates from energy-related combustion (1.2-15 Tg yr−1) (Bond et al., 2013). Most values given in the literature are included in this large (factor of 10) range. Granier et al. (2011) compared inventories over the past decades at different scales. In 2005, the ratio of the highest over the lowest global BC emissions estimate is only 1.28, but higher values are given at the regional scale in western and central Europe, with ratios of 1.34 and 1.76, respectively. Uncertainties arise from emission factors (usually highly dependent on the conditions of use and the type of equipment), activity data and spatial distribution for some source sectors (e.g., wood burning for heating). By analyzing a large number of source profiles, Chow et al. (2011) found highly variable BC contents in different PM2.5 emission factors, in the range of 6-37 % for on-road light-duty gasoline engine exhausts, 33-74 % for on-road heavy-duty diesel engine exhausts, 29-61 % for tire wear, 6-13 % for agricultural burning, 4-33 % for residential wood combustion, and 3-14 % for oil combustion stationary sources. Dallmann and Harley (2010) quantified uncertainties in PM2.5 emission factors for various mobile sources, such as on-road gasoline (±45 %) and on-road diesel (±59 %) sources. However, the authors estimated a much lower uncertainty in fuel consumption, around ±3 and ±5 % for on-road gasoline and diesel vehicles, respectively. In the evaluation of the BC climate impact, these uncertainties on emissions contribute, among other uncertainty sources such as microphysical interactions in clouds or removal processes, to a large 95 % confidence interval on the BC radiative forcing, between +0.17 and +2.1 W m−2 (Bond et al., 2013). It is worth noting that deducing local/regional-scale BC emission uncertainties (for instance over the Paris megacity) from those at the global scale appears to be difficult. Indeed, while uncertainties usually increase when considering smaller domains (because of uncertainties in the spatial distribution of emissions), some uncertainty sources relevant at the global scale may be reduced at the European scale (e.g., minor contribution of highly uncertain open burning emissions, better constrained local activity data). This is notably the case for a megacity like Paris, located in a post-industrial country where the most uncertain sources have a low contribution.
Compared to BC, many more efforts have been made to assess NOx emissions, which in turn appear better constrained. In the inventory intercomparison of Granier et al. (2011), the ratios between the highest and lowest NOx emissions are 1.15 at the global scale and 1.18 and 1.23 in western and central Europe, respectively. Owing to real-time measurements, large point source emissions are expected to be reasonably estimated in countries with mandatory emission monitoring. Concerning the traffic source, which dominates the overall emissions (particularly in summer), uncertainties are still substantial. In bottom-up inventories, these emissions are usually estimated with traffic emission models of various types and complexities. Most uncertainties arise from both activity data and emission factors. Activity data are difficult to estimate as the fleet and its technological characteristics (e.g., Euro standards) are perpetually evolving. Concerning emission factors, various techniques are available, both under controlled conditions and in the real world, and difficulties arise when combining them (see Franco et al., 2013, for a review). Compared to the uncertainties previously given for PM2.5 emission factors from Dallmann and Harley (2010), uncertainties for NOx emission factors are significantly lower, with values around ±27 and ±22 % for on-road gasoline and on-road diesel sources, respectively. Smit et al. (2010) reviewed results from 50 studies dealing with the validation of traffic emission models and pointed out a tendency to overestimate NOx emissions regardless of the validation technique (e.g., tunnel, on-board or ambient concentration studies) or the model type (e.g., average speed, traffic situation or modal model) employed. In a critical evaluation of on-road vehicle emission inventories over the United States, Parrish (2006) indicated rather accurate NOx inventories in the 1990s but decreasing NOx emission estimates during the last decade, in contradiction with the tendency inferred from the evolution of NOx concentrations. Additionally, in Monte Carlo analyses, where uncertainties on NOx emissions are often considered, the 2σ uncertainty ranges fixed for traffic NOx emissions, usually taken from the literature and/or expert judgment, vary substantially: ±80 % (regardless of source) in Deguillaume et al. (2007) (deduced from the ±40 % 1σ uncertainty given in the paper), ±100 % for area sources (and ±50 % for point sources) in Hanna et al. (2001) (also used in Tian et al., 2010), and ±50 % for area sources (and ±3 % for point sources) in Napelenok et al. (2011).
The evaluation of these inventories still remains a critical point since emissions are generally not directly measurable.
The use of CTMs for direct comparisons between measured and simulated concentrations is mostly inadequate for drawing precise conclusions on emission inventories, because mixing processes and chemical transformations prevent concentrations measured at a receptor point from being unambiguously linked to emissions at a point aloft. In addition, CTMs and the meteorological input data they use have their own uncertainties. Different alternative approaches have thus been developed. Concerning BC emissions, useful information may be gained from their evaluation relative to emissions of another compound for which the uncertainty in emissions is expected to be smaller. For example, Zhou et al. (2009) derived BC emissions in two Chinese megacities from CO emissions and measured BC / CO ratios at sites 10-15 km downwind of the cities. However, given the large size of these megacities, the measurement representativeness for the city emissions remains an open question. Other methods specifically assess transport patterns of emissions to the receptor point. Xu et al. (2013) developed a method based on in situ measurements and backward trajectory analyses to evaluate BC emissions over the North China Plain. A promising approach consists in using inverse modeling techniques, which have been widely applied to NOx emissions using variational data assimilation (Mendoza-Dominguez and Russell, 2000), Bayesian Monte Carlo approaches (Deguillaume et al., 2007; Konovalov et al., 2008), and Kalman filter approaches (Napelenok et al., 2008; Gilliland and Abbitt, 2001). NO2 columns retrieved by satellites (e.g., from GOME, SCIAMACHY, OMI, GOME-2) provide a valuable observational basis for many of these works because of their large data coverage. For BC, the only study using variational data assimilation we are aware of is that of Hakami et al. (2005), which aims at better constraining BC emissions over East Asia from in situ measurements.
The aim of this paper is to evaluate emission inventories at the scale of a large city. In this context, it presents an original methodology based on airborne measurements in the city plume and chemistry-transport simulations. It is applied to BC and NOx emission inventories over the Paris megacity, using the CHIMERE model. Observations used in this study were obtained in an intensive campaign that took place in and around the city in July 2009 in the framework of the MEGAPOLI European project (Megacity: Emissions, urban, regional and Global Atmospheric POLlution and climate effect, and Integrated tools for assessment and mitigation; www.megapoli.info). In particular, our study relies on airborne BC measurements in the city plume, thus trying to alleviate problems of the representativeness of ground-based in situ measurements. In Sect. 2, the general methodology is described. All input data, including measurement data, emission inventories, and the CHIMERE model, are described in Sect. 3. Results from both ground and airborne measurements are discussed and compared in Sect. 4.

Methodology
The method developed in this study aims at evaluating, at the scale of a large city, emission inventories of species that can be traced at the scale of a few hours, i.e., either a chemically inert (at the timescale considered) single compound or a conservative family of products all originating from a unique primary compound. The method is based on airborne measurements of such species in the megacity plume during the afternoon in a well-mixed convective boundary layer (BL); the vertical mixing can thus be considered rather well established, and consequently the concentrations measured at a particular altitude can be considered representative of concentrations in the whole BL.
A CTM simulation, using the inventory to be evaluated, is used to simulate tracer concentrations in the plume. For both observations and simulations, tracer concentrations above the regional background and within the pollution plume can be integrated along the flight path perpendicular to the plume. The ratio of the simulated area over the measured area corresponds to a spatially averaged emission error factor (EEF) for the agglomeration for each flight. To achieve such a calculation, the plume needs to be well distinguishable from the background, which requires large enough local emissions in the city and a rather homogeneous background.
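The EEF calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the trapezoidal integration, the clipping of negative excess to zero and the way the background values are supplied are all hypothetical choices, not the procedure actually used in the paper.

```python
def plume_excess_area(distance_km, conc, background):
    """Integrate the concentration above background along a transect.

    distance_km: along-track positions of the samples (increasing).
    conc:        tracer concentrations at those positions.
    background:  regional background value (assumed constant here).
    """
    # Assumption: values below background contribute nothing to the plume area.
    excess = [max(c - background, 0.0) for c in conc]
    area = 0.0
    for i in range(1, len(distance_km)):
        width = distance_km[i] - distance_km[i - 1]
        area += 0.5 * (excess[i] + excess[i - 1]) * width  # trapezoidal rule
    return area


def emission_error_factor(distance_km, simulated, observed, bg_sim, bg_obs):
    """Ratio of the simulated to the observed plume area (EEF) for one transect."""
    sim_area = plume_excess_area(distance_km, simulated, bg_sim)
    obs_area = plume_excess_area(distance_km, observed, bg_obs)
    return sim_area / obs_area


# Synthetic transect: the simulated plume is laterally shifted but 1.5 times larger.
transect = [0, 10, 20, 30, 40]
observed = [1, 1, 3, 1, 1]
simulated = [1, 4, 1, 1, 1]
print(emission_error_factor(transect, simulated, observed, 1.0, 1.0))
```

In this synthetic case the simulated peak sits at a different position along the transect than the observed one, yet the ratio of integrated areas is unaffected by the shift, which is the property the method relies on.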
This method aims at reducing the influence of some errors in the CTM. By considering integrated peak areas over lateral transects across the plume, it allows the effect of some potential errors in the structure of the simulated plume to be minimized (e.g., any error in lateral dispersion, reasonable errors in wind direction) and consequently focuses more on emissions. However, several potential error sources still remain and therefore need to be carefully investigated: (i) the wind speed, which directly determines the temporal window of emissions sampled during the flight; (ii) the degree of vertical mixing, which determines the representativeness of the airborne measured concentrations; (iii) the wet and dry deposition of the tracer, which can lead to discrepancies in the emissions factors if not well simulated by the model; and (iv) the boundary layer height, which directly affects the level of concentrations. These points will all be discussed in the next sections.
The methodology is applied in this paper to BC and NOx emissions. As a chemically inert compound, BC can be directly used as a tracer. For the NOx compounds, as they undergo many fast chemical reactions, the NOy family gathering all reactive nitrogen species (e.g., NOx, NO3, HNO3, HONO, N2O5, PAN) appears more conservative at the timescale of a plume and is thus used as a tracer of NOx emissions.

Measurement database
In the framework of the EU FP7 MEGAPOLI project (Baklanov et al., 2010), two 1-month intensive campaigns (July 2009 and January/February 2010) were organized in the greater Paris area to better characterize organic aerosol in a large megacity. The study presented here is based on observations obtained during the summer campaign. Petzold et al. (2013) recently made some recommendations about the use of the term "BC" for black carbon, distinguishing various terminologies depending on the property used in the measurement technique: (i) the light absorption coefficient σap, or equivalent BC (EBC) if the mass-specific absorption coefficient (MAC) is indicated, for instruments based on light absorption; (ii) refractory BC (rBC) for instruments based on refractory properties; and (iii) elemental carbon (EC) for instruments focusing on the chemical composition or the carbon content based on thermo-optical methods. So far in this paper, the term BC has been employed as a qualitative and commonly used term. In this section, the terminology of Petzold et al. (2013) is used.
Ground measurements of light absorption coefficient, EC and NOx were performed at the LHVP (Laboratoire d'Hygiène de la Ville de Paris) station (48.829° N, 2.359° E), an urban background site in the center of Paris. EC concentrations are provided by an OCEC Sunset Laboratory field instrument at a resolution of 1 h. The light absorption coefficient (at a measured wavelength of 637 nm, different from the instrument nominal wavelength of 670 nm) is measured by a multi-angle absorption photometer (MAAP, model 5012, Thermo Scientific®) at a 5 min resolution, and converted into EBC with an MAC of 8.8 m2 g−1 derived from a linear regression between MAAP and OCEC Sunset measurements at the LHVP site (R2 = 0.88, N = 533; see Sect. S1 in the Supplement for details). NOx observations come from a chemiluminescence monitor (AC31M, Environment SA) equipped with molybdenum oxide converters. However, an efficient conversion of many other nitrogen-containing compounds (NOz = NOy − NOx) by the molybdenum converter can lead to interferences in the NOx concentrations (Dunlea et al., 2007). This positive artifact varies from one location to another, depending on the relative contribution of NOz compounds in the NOy family. Dunlea et al. (2007) estimated a mean overestimation of +22 % in Mexico City.
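The conversion from absorption coefficient to EBC mentioned above is a simple division by the MAC. The sketch below illustrates the unit handling and one possible way of deriving a MAC from paired measurements. The function names are hypothetical, and the regression forced through the origin is an assumption made for illustration; the actual regression is described in Sect. S1 of the Supplement.

```python
MAC_M2_PER_G = 8.8  # MAC value quoted in the text for the LHVP site


def ebc_from_absorption(sigma_ap_per_Mm, mac_m2_per_g=MAC_M2_PER_G):
    """Convert an absorption coefficient (Mm-1) into EBC (ug m-3).

    1 Mm-1 = 1e-6 m-1, and dividing by a MAC in m2 g-1 gives 1e-6 g m-3,
    i.e. 1 ug m-3, so the two unit prefixes cancel numerically.
    """
    return sigma_ap_per_Mm / mac_m2_per_g


def mac_through_origin(sigma_ap_per_Mm, ec_ug_m3):
    """Least-squares slope of absorption versus EC, forced through zero.

    Assumption: a zero intercept is imposed here only to keep the example short.
    """
    num = sum(s * e for s, e in zip(sigma_ap_per_Mm, ec_ug_m3))
    den = sum(e * e for e in ec_ug_m3)
    return num / den


# Synthetic values: an absorption of 8.8 Mm-1 corresponds to 1 ug m-3 of EBC.
print(ebc_from_absorption(8.8))
print(mac_through_origin([8.8, 17.6], [1.0, 2.0]))
```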
Among the chemical data available in the Paris plume, NOy and EBC airborne measurements aboard the French ATR-42 aircraft have been used (see Freney et al., 2014, for a detailed description of the aircraft campaign). Measurements are available for several days in July: 1, 9, 10, 13, 15 (only EBC that day), 16, 20 (only NOy), 21, 25, 28 (only EBC), and 29 (only NOy). NOy concentrations were measured at a 30 s time resolution with an instrument designed for airborne measurements that consists of three Eco Physics (CLD 780 TH) analyzers in which NO is measured using ozone chemiluminescence. NO2 is photolytically converted, and NOy is converted with H2 in a gold-covered heated oven (see Freney et al., 2014, for details). The limit of detection is 10 pptv. NO, NO2 and NOy measurement uncertainties have been estimated by these latter authors at 10, 20 and 20 %, respectively. The measured NOy includes the following species: NO, NO2, HNO2, HNO3, HO2NO2, N2O5, PAN, PPN and particulate nitrate. EBC particles are collected with a 50 % passing efficiency aerodynamic diameter of 5.0 µm (McNaughton et al., 2007), but most soot particles are likely in the fine mode. The light absorption coefficient at 650 nm is measured at a 60 s time resolution with a particle/soot absorption photometer (PSAP) instrument (Radiance Research®), corrected as in Bond et al. (1999) (see Sect. S2 in the Supplement for details). PSAP absorption coefficient measurement uncertainties are around 20-30 % (Bond et al., 1999; Virkkula et al., 2005). Absorption values are then converted into EBC concentrations using the MAC of 8.8 m2 g−1 already used at the LHVP site. The uncertainty related to the MAC is discussed in Sect. 4.3.4. Various physical parameters are also measured in the ATR-42 aircraft at a 1 s time resolution, including wind speed, wind direction and position of the aircraft (longitude, latitude, height). BL height (BLH) estimations are available at the SIRTA (Site Instrumental de Recherche par Télédétection Atmosphérique) station (48.712° N, 2.208° E; suburban background site about 20 km southwest of Paris) and at the LHVP station. At SIRTA, they are estimated from ALS450 Leosphere backscatter lidar data at a 5 min time resolution (Haeffelin et al., 2012). At LHVP, values are estimated from CL31 ceilometer data (Haeffelin et al., 2012). Traditional meteorological parameters (wind, temperature) are also measured at SIRTA, where Leosphere wind cube lidar measurements are additionally available, providing wind measurements at a 10 min time resolution, every 20 m from 40 to 200 m above ground level (a.g.l.).

Emission inventories
Three European anthropogenic emission inventories are evaluated in this paper, all referring to year 2005:
1. The EMEP inventory (Vestreng et al., 2007), with a longitude-latitude resolution of 0.5° × 0.5°.
2. [...] et al. (2012). The inventory is the year 2005 base case and also serves as a starting point for the inventory described next. In this paper it is referred to as the TNO inventory.
3. A third inventory based on the TNO inventory and with the same 1/8° × 1/16° longitude-latitude resolution but incorporating bottom-up emission data over four European metropolitan regions (Paris, London, Rhine-Ruhr and Po Valley). The city emission inventories were compiled by local authorities responsible for city emission inventories and air quality such as Airparif for Paris (Airparif, 2010). It is described in more detail by Denier van der Gon et al. (2011), and has been previously used in Zhang et al. (2013) and Timmermans et al. (2013). It will be referred to as the TNO-MP (MP for MEGAPOLI) inventory.
The same EC / OC speciation table, primarily associated with the TNO inventory, is applied in all inventories. Sector-dependent factors used to derive July emissions from annual ones are reported in Table S2 in the Supplement. This table also shows the total sector-wise BC July emissions over the region for the three inventories.
The resolution of both the TNO and TNO-MP inventories is considerably improved compared to the EMEP inventory. Despite its coarse spatial resolution, the comparison of this latter inventory with the two refined ones remains relevant for several reasons: (i) before being applied to simulations, emissions are downscaled to the air quality model resolution, here to a 3 km horizontal resolution, using the 1 × 1 km resolved GLCF (Global Land Cover Facility) land use database (Hansen et al., 2000; Hansen and Reed, 2000), and (ii) concentrations are considered in the Paris plume, i.e., at a rather large spatial scale, which decreases the influence of such a coarse resolution in the emissions. Emissions are apportioned according to several types of land use: urban, rural, forest, crops and maritime (Menut et al., 2013). Because of their better horizontal resolution, the evaluation of emission inventories in this paper focuses mainly on the TNO and the TNO-MP inventories, and the EMEP inventory is taken as an additional reference emission inventory, as it is used in many studies in Europe.
The spatial distribution of BC and NOx emissions in the Paris region during a typical working day in July is given in Fig. 1 for each inventory. In order to illustrate the important differences in the spatial distribution of emissions between inventories, one can compute, for all inventories, the mean emissions of all cells within a certain distance around the LHVP site, from 0 (only the LHVP cell) to 80 km (the whole region) (see Fig. S3 in the Supplement). Relative to TNO-MP, the EMEP inventory BC and NOx emissions in the center of Paris are relatively low but increase further away. The coarse resolution and the effect of the previously mentioned emission downscaling are clearly visible in Fig. 1 for EMEP emissions, and lead to obvious discontinuities between original cells. At a large scale, the EMEP inventory gives the highest emissions for both compounds (and more particularly for NOx emissions). Conversely, the TNO and TNO-MP inventories display significantly higher emissions in the agglomeration center, and lower ones further away. In summer, the TNO-MP inventory shows a spatial emission distribution quite similar to the TNO one. In particular, major highways around Paris, spatially unresolved or missing in the EMEP inventory, are clearly visible thanks to the refined resolution. However, BC emissions in TNO-MP are considerably lower than in TNO in the agglomeration. In absolute terms, discrepancies between both inventories mainly originate from road transport (SNAP sector 7) and residential/tertiary (SNAP 2) sources, and to a lesser extent from waste disposal (SNAP 9), non-road transport (SNAP 8) and industrial process (SNAP 4) sources (see Table S2 in the Supplement for details). However, the highest relative discrepancies (which exceed a factor of 3) are associated with SNAP 2, 8 and 9. For sources that are treated as area sources in the top-down TNO inventory (e.g., SNAP 2 and 9), they are likely due to the distribution proxies used to downscale national totals, leading to overly high emissions in Paris, where the population density is very high (Denier van der Gon et al., 2010). Concerning the SNAP 4 point sources, discrepancies can probably be explained by the use of generic capacity rules in TNO, rather than exact emissions in TNO-MP. Both inventories are equivalent outside this region. A quite similar pattern is given for NOx emissions, except that discrepancies between both inventories in Paris are greatly reduced. In terms of BC / NOx emission ratios, the highest values are given by the TNO inventory, followed by the EMEP one, and finally the TNO-MP one. Differences are maximum in the center of Paris, and decrease when integrating over larger areas.
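The radial comparison mentioned at the start of this paragraph (mean emissions of all cells within a given distance of the LHVP site) can be sketched as follows. Only the LHVP coordinates come from the text; the cell list, the emission values, the use of cell centers and of a great-circle distance are hypothetical choices made for illustration.

```python
import math

LHVP = (48.829, 2.359)  # latitude, longitude of the LHVP site (from the text)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def mean_emission_within(cells, radius_km, site=LHVP):
    """Mean emission of cells whose center lies within radius_km of the site.

    cells: list of (lat, lon, emission) tuples for the cell centers.
    The cell nearest to the site is always kept, so that a radius of 0
    returns the emission of the site cell only.
    """
    distances = [haversine_km(site[0], site[1], lat, lon) for lat, lon, _ in cells]
    nearest = min(range(len(cells)), key=lambda i: distances[i])
    selected = [cells[i][2] for i in range(len(cells))
                if i == nearest or distances[i] <= radius_km]
    return sum(selected) / len(selected)


# Synthetic cells: high emissions in the center, lower ones further away.
cells = [(48.829, 2.359, 10.0), (48.829, 2.5, 4.0), (49.5, 2.359, 1.0)]
for radius in (0, 20, 80):
    print(radius, mean_emission_within(cells, radius))
```

With such a center-weighted synthetic field, the mean decreases as the radius grows, which mirrors the behavior described for the TNO and TNO-MP inventories.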
It is worth noting that, as previously mentioned, both the PSAP and the MAAP instruments are based on the measurement of light absorption, and observations should thus be referred to as EBC. However, as emission factors and source profiles used to build emission inventories are mostly expressed as EC (Vignati et al., 2010; Chow et al., 2011; H. A. C. Denier van der Gon, personal communication, 2011), the simulated "BC" should be regarded as EC. An ambiguity therefore arises from comparisons between observed EBC and modeled EC since they correspond to different quantities. This point is discussed in Sect. S1 in the Supplement and in Sect. 4.3.4. In the following, the term BC will be kept for convenience for both observations and simulations.

CHIMERE model description
In this paper, all simulations are performed with the CHIMERE CTM (Schmidt et al., 2001; Bessagnet et al., 2009; Menut et al., 2013) (www.lmd.polytechnique.fr/chimere). The model was originally designed to provide [...] The CHIMERE model allows for simulation of transport, gas-phase chemistry, some aqueous-phase reactions, and size-dependent aerosol species, including secondary organic aerosol, dry and wet deposition. It treats coagulation, absorption and nucleation aerosol processes. Inorganic aerosol thermodynamic equilibrium is calculated using the ISORROPIA model (Nenes et al., 1998).

Model configuration and simulated cases
In this paper, simulations are performed during the summer MEGAPOLI campaign (July 2009) with a 5-day spin-up period. Two nested domains of increasing resolution, CONT3 (0.5° × 0.5°, i.e., ∼ 50 × 50 km, 67 × 46 cells) and MEG3 (0.04° × 0.027°, 120 × 120 cells), are considered (see Fig. S4 in the Supplement). The choice of the domains was previously explained in Zhang et al. (2013). The domain is subdivided into eight vertical layers, from the ground to more than 5000 m height, with vertical resolution decreasing with altitude. The first three layers have a depth of about 40, 70 and 110 m, respectively.
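As a quick check of how an angular grid spacing translates into kilometres near Paris, the following sketch converts a longitude-latitude cell size into approximate distances. It assumes a spherical Earth of radius 6371 km and uses the LHVP latitude as the reference latitude; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical-Earth approximation (assumption)


def cell_size_km(dlon_deg, dlat_deg, lat_deg):
    """Approximate east-west and north-south extent of a lon-lat cell, in km."""
    km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0
    dx = dlon_deg * km_per_deg * math.cos(math.radians(lat_deg))  # shrinks with latitude
    dy = dlat_deg * km_per_deg
    return dx, dy


# MEG3 cell size from the text, evaluated at the latitude of the LHVP site.
dx, dy = cell_size_km(0.04, 0.027, 48.829)
print(f"MEG3 cell: {dx:.1f} km x {dy:.1f} km")
```

Under these assumptions a MEG3 cell is about 2.9 km by 3.0 km, consistent with the 3 km horizontal resolution quoted for the air quality model.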
Boundary and initial conditions are taken from the LMDz-INCA2 global model for gaseous species and from LMDz-AERO for particulate species (Hauglustaine et al., 2004; Folberth et al., 2006). The model uses the previously described anthropogenic emission inventories, while biogenic emissions are computed with MEGAN data and parameterizations from Guenther et al. (2006). In order to investigate the influence of meteorology on results, two meteorological data sets are considered. The first has been produced with the PSU/NCAR mesoscale meteorological model (MM5; Dudhia, 1993), run over three nested domains with increasing resolutions of 45, 15 and 5 km, respectively, and using Global Forecast System (GFS) data from the National Center for Environmental Prediction (NCEP) as boundary conditions and large-scale data. The second one has been produced with the Weather Research and Forecasting model (WRF; Skamarock et al., 2005; www.wrf-model.org) for the same domains and resolutions. Note also that MM5 and WRF have distinct boundary layer schemes: Medium Range Forecast (MRF) for the first, and Yonsei University (YSU) for the second.

Results and discussion
In this section, we first evaluate meteorological input data (Sect. 4.1). A first simple approach is then applied to evaluate BC emissions against NOx ones, based on ground-based measurements at the urban background LHVP site in Paris (approach no. 1, Sect. 4.2). We then describe the procedure to evaluate BC emissions based on airborne measurements in the Paris plume, and present the corresponding results (approach no. 2, Sect. 4.3). We finally discuss discrepancies between both methods (Sect. 4.4).
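Statistical metrics are defined as follows:
- Mean bias: $\mathrm{MB} = \frac{1}{n}\sum_{i=1}^{n}(m_i - o_i) = \bar{m} - \bar{o}$
- Normalized mean bias: $\mathrm{NMB} = \frac{\bar{m} - \bar{o}}{\bar{o}}$
- Root-mean-square error: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(m_i - o_i)^2}$
- Normalized root-mean-square error: $\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{o}}$
In the above, $m_i$ and $o_i$ are the modeled and observed concentrations at time $i$, respectively, $\bar{m}$ and $\bar{o}$ their averages over the period, and $n$ the number of model-observation pairs.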
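For readers who want to reproduce these scores, the four metrics named above can be computed as in the sketch below. It assumes the conventional definitions, with both normalized metrics divided by the mean observation; the function names are hypothetical.

```python
import math


def mean_bias(model, obs):
    """Mean of (model - observation) over all pairs."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def normalized_mean_bias(model, obs):
    """Mean bias divided by the mean observation (assumed normalization)."""
    return mean_bias(model, obs) / (sum(obs) / len(obs))


def rmse(model, obs):
    """Root-mean-square error over all pairs."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def normalized_rmse(model, obs):
    """RMSE divided by the mean observation (assumed normalization)."""
    return rmse(model, obs) / (sum(obs) / len(obs))


# Synthetic example: the model is 1 unit too high at every time step.
model = [2.0, 4.0]
obs = [1.0, 3.0]
print(mean_bias(model, obs), normalized_mean_bias(model, obs))
print(rmse(model, obs), normalized_rmse(model, obs))
```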

Evaluation of meteorological data
In this section, meteorological input data used in CHIMERE simulations, with both MM5 and WRF models, are evaluated against observations at the surface and in altitude.

Surface observations
Figure 2 shows comparisons between observations and simulations for meteorological parameters obtained at the SIRTA ground site. Statistical results are reported in Table 1, considering all hours as well as only the 06:00-14:00 UTC time period (referred to hereafter as morning hours); the latter is more relevant in our methodology since transport from the urban emission sources to the aircraft location occurs in the morning and the early afternoon. Except for the first days of a continental northeasterly wind regime, the period is dominated by an oceanic regime with west and southwest winds. The MM5 model shows a negative bias of −0.87 °C for ground temperature (reduced to −0.45 °C by considering only morning hours), and a positive bias for wind speed (+33 %). BLH appears strongly underestimated, with a bias of −34 %, reduced to −24 % during morning hours. Satisfactory correlations are found for temperature (due to the diurnal cycle) and wind speed (R around 0.8-0.9), but lower ones are obtained for BLH (around 0.5). Conversely, the WRF model shows better results for temperature (now slightly overestimated, with a bias of +0.45 °C) and overall BLH, with an underestimation reduced to −17 % (and −12 % during morning hours). Correlations for this latter parameter are significantly improved compared to the MM5 model (0.7 against 0.5). As one of the factors contributing to the BLH negative bias, diurnal profile comparisons (see Fig. S5 in the Supplement) show that the transition from a convective to a stable BL in the evening hours occurs much too early in the MM5 model, particularly at the LHVP site. This shift persists with WRF but is strongly reduced, which explains the better correlations.

Observations in altitude
Wind lidar observations are compared with simulated wind speed in the first model layers (the first vertical layers in CHIMERE are at 43, 118 and 248 m a.g.l.) (see Fig. S6 in the Supplement). Due to a low vertical resolution in simulations, comparisons remain qualitative. The MM5 and WRF models show quite similar patterns, but the MM5 model tends to give a higher wind speed at all levels, including at ground level. Statistical results over the 06:00-14:00 UTC time window in the 110-210 m altitude range and for the flight days are reported in Table S3 in the Supplement. On average, low negative biases and reasonable NRMSE are obtained with both the MM5 and WRF models. At the daily scale, biases on wind speed remain below ±30 %, except on 13, 16 and 28 July (29 July), during which one or both models give high underestimations (overestimations), up to −46 % (+27 % with MM5). Errors in terms of NRMSE exceed 25 % for all these dates, as well as on 21 July (above 26 %), despite a low bias (error compensation between an underestimation during the first hours and an overestimation during the last hours).
Wind speed simulation results are much better along the aircraft path (not shown), all biases remaining below ±20 % while NRMSE range between 12 and 32 %. The highest biases occur on 1 and 28 July, around −18 %. From a general point of view, the moderate positive bias found in wind speed at ground level vanishes in altitude except at mid-altitude during specific days, leading to a noticeable decrease of the NRMSE.

Approach no. 1: emissions evaluation from surface measurements
BC emissions can be first evaluated at ground level relative to those of NO x by assuming that both concentrations are proportional to their emissions close to their sources.Urban background BC and NO x concentrations, their ratio and their diurnal profiles are presented in Fig. 3, considering only flight days.Details on the evaluation of CHIMERE against observations and statistical results are given in Sect.S3 in the Supplement.Briefly, BC is strongly overestimated, in particular with the TNO inventory and during BL transitions; the use of WRF reduces biases mainly during the late afternoon (after 18:00 UTC).NO x is also overestimated, but mainly during the end of the day.Observed BC / NO x ratios are rather constant (0.06 µg m −3 ppb −1 on average) in July, except during some nights, but with a diurnal pattern showing lower values around 05:00 UTC and higher ones around midnight.CHIMERE also simulates rather constant ratios but with a positive bias for TNO and to a lesser extent for EMEP inventories, while the bias for TNO-MP emissions is rather small (< 13 %).We now evaluate, in further detail, BC emissions relative to NO x ones.In order to compare emission evaluation results between the two approaches, only flight days are considered, but some results over the whole month of July will also be indicated.It is worthwhile noting that, even if both BC and NO x are mainly locally emitted within the Paris agglomeration, biases may be partly related to errors in advected contributions: BC can be transported from outside (like on 1 July), while some NO x may be advected during the night and the early morning (when its photolytic conversion into HNO 3 or HONO is less active) or released by reservoir species (e.g., PAN).NO x measurements at two rural background stations in the south and southeast of Paris are available from the AIRPARIF network.On average, the NO x regional background roughly accounts for 15-25 % and 15-30 % of the levels in Paris for observations and 
simulations, respectively. From 1-year measurements during the PARTICULES campaign (Bressi et al., 2013), it has been found that the BC regional background contributes about a third of the annual BC urban background average in Paris, and that this fraction is probably underestimated by the CHIMERE model (Petetin et al., 2014). However, this uncertainty source remains difficult to quantify more precisely. Additionally, NOx chemiluminescence measurements may also include some NOz compounds (Dunlea et al., 2007), but this is not likely a large error source since the CHIMERE model gives an average NOx / NOy ratio above 92 % at the LHVP site.
Given all these elements, we thus consider the BC / NOx ratio over the 05:00-08:00 UTC time window, corresponding to rush hours, when fresh NOx and BC are expected to dominate. BC vs. NOx concentrations during that time window are represented in Fig. 4. Simulated slopes of BC vs. NOx reported in Table 2 show a high overestimation with respect to observed ones for the TNO inventory, around a factor of 4. Overestimations are reduced to a factor of 2.8 and 2.2 for EMEP and TNO-MP inventories, respectively. Uncertainties in emission error factors (at a 95 % confidence level) are quite similar for all inventories, around 18 %, since they essentially originate from the uncertainty on the slope deduced from observations (i.e., BC / NOx ratios are more variable in observations than in simulations). Note also that discrepancies between BC vs. NOx slopes and BC / NOx ratios (for the latter, biases remain below +136 %) are due to the diurnal variability in the measured BC / NOx ratio, which shows the lowest values during the 05:00-08:00 UTC time window. Results finally indicate an overestimation of BC emissions relative to NOx emissions, particularly in both top-down inventories (EMEP and TNO), which is significantly reduced with local bottom-up information integrated in the TNO-MP inventory. It is worthwhile noting that, as a long-lived species related to combustion processes, carbon monoxide is another appropriate candidate for the evaluation of BC emissions (Zhou et al., 2009). However, it should also be mentioned that, due to its significant background concentrations, higher uncertainties (compared to NOx) may arise from errors in the simulation of the regional background around Paris, even considering only rush hours.
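The slope comparison described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit with intercept (the fit actually used for Table 2 is not specified here), and the function names and sample values are hypothetical. The relative emission error factor is taken as the ratio of the simulated BC vs. NOx slope to the observed one.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x (assumed fit type)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def relative_error_factor(nox_obs, bc_obs, nox_sim, bc_sim):
    """Ratio of simulated to observed BC vs. NOx slopes.

    A value of 4 would mean that BC emissions relative to NOx emissions
    are overestimated by a factor of 4 in the inventory.
    """
    return ols_slope(nox_sim, bc_sim) / ols_slope(nox_obs, bc_obs)


# Hypothetical rush-hour values (NOx in ppb, BC in ug m-3), for illustration only.
nox = [10.0, 20.0, 30.0, 40.0]
bc_observed = [0.5, 0.9, 1.3, 1.7]
bc_simulated = [1.6, 3.2, 4.8, 6.4]
factor = relative_error_factor(nox, bc_observed, nox, bc_simulated)
```

With these made-up values the observed slope is 0.04 and the simulated one 0.16 µg m⁻³ ppb⁻¹, so the factor is 4.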

Approach no. 2: emission evaluation from airborne measurements
Given these first results obtained at ground level, the alternative approach based on airborne measurements in the plume is now presented. The procedure is first described in detail, and the results are then shown and their uncertainties discussed.

Procedure to compute emission error factors (EEFs)
As an illustration, the TNO/MM5 case for two flights on 10 and 13 July is considered. Aircraft trajectories and BC concentrations during these days are presented in Fig. 5. As previously mentioned, the inlet used to collect BC particles is characterized by a 50 % passing efficiency aerodynamic diameter of 5.0 µm, and BC measurements are thus compared to the simulated BC concentration below 5 µm. Time series given in Fig. 6 show a series of peaks that correspond to successive crossings of the plume (time series for all July flights are given in Fig. S7 in the Supplement). In both observations and simulations, the Paris region plume is well distinguishable against background, and peaks can thus be located on the trajectory, giving the approximate central line of the plume. Errors in the simulated wind direction lead to a shift in the spatial localization of the plume (e.g., 13 July). It is worthwhile noting that some plumes from other cities may sometimes be sampled by the aircraft. The case of 13 July is notable: slight increases in BC concentrations in the western part of the flight track (after 13:00 UTC) correspond to plumes from Rouen and Le Havre, two industrial cities. Concentration variations at the end of the flight correspond to a vertical profile up to 3 km a.g.l. performed by the aircraft. In this study we focus on the time period during which the aircraft altitude is rather constant (about 600 m a.g.l.). As briefly described in Sect. 2, the methodology consists in computing for each transect the plume integral of concentrations above background, this latter being estimated in both model and observations as the 30th percentile of concentrations. Given that the aircraft does not exactly cross the plume perpendicularly, but with an angle α between 0 and 90° that may be different in simulations compared to the real world, a correction factor of sin(α) is computed and applied to each peak area. Considering that atmospheric diffusion theories predict a linear relationship between point source emissions and concentrations in the plume, an EEF is finally defined for each flight as the ratio of the simulated area to the observed one (i.e., an error factor of 2 means an emission overestimation of 100 %):

$$\mathrm{EEF} = \frac{A_\mathrm{sim}}{A_\mathrm{obs}},$$

where $A_\mathrm{sim}$ and $A_\mathrm{obs}$ denote the simulated and observed plume integrals above background, each corrected by $\sin(\alpha)$.

It is worthwhile noting that such an evaluation applies to the combination of (i) the PM emission inventory, (ii) the PM speciation into BC, (iii) the monthly emission factor for July and (iv) the hourly emission profile. Although uncertainties are expected to be larger for the first two elements, the other two may also contribute to the errors. In the following discussion, it should be kept in mind that the reference to BC emissions aggregates all these elements.
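The EEF procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the background is the 30th percentile of the supplied concentrations, the integral is trapezoidal along the transect distance, negative excesses are clipped to zero, and the areas of all transects of a flight are summed before taking the ratio. All function names and sample values are hypothetical.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def plume_area(distance_km, conc, alpha_deg, background):
    """Integral of concentration above background along one transect,
    multiplied by sin(alpha) to correct for a non-perpendicular crossing."""
    # Assumption: values below background contribute nothing to the area.
    excess = [max(c - background, 0.0) for c in conc]
    area = 0.0
    for i in range(len(conc) - 1):
        step = distance_km[i + 1] - distance_km[i]
        area += 0.5 * (excess[i] + excess[i + 1]) * step
    return area * math.sin(math.radians(alpha_deg))


def emission_error_factor(sim_areas, obs_areas):
    """Ratio of simulated to observed corrected plume areas for one flight."""
    # Assumption: transect areas are summed per flight before the ratio.
    return sum(sim_areas) / sum(obs_areas)


# Hypothetical single-transect example.
dist = [0.0, 10.0, 20.0, 30.0, 40.0]
obs = [1.0, 1.0, 3.0, 1.0, 1.0]
sim = [1.0, 1.0, 5.0, 1.0, 1.0]
a_obs = plume_area(dist, obs, 90.0, percentile(obs, 30.0))
a_sim = plume_area(dist, sim, 90.0, percentile(sim, 30.0))
eef = emission_error_factor([a_sim], [a_obs])
```

In this made-up case the simulated excess area is twice the observed one, so the EEF is 2, i.e., an emission overestimation of 100 %.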

Results on EEFs
BC and NOx EEFs are given for each flight in Fig. 7. Average results and confidence intervals (at a 95 % confidence level, i.e., at 2σ) are also reported; both the mean and the confidence interval on the mean are computed considering errors as multiplicative.
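One common way to treat errors as multiplicative is to average in log space. The sketch below assumes this reading: the mean is the geometric mean of the per-flight EEFs, and the 2σ uncertainty on the mean is expressed as a multiplicative factor. The exact formulas used in the study are not reproduced here, and the function name is hypothetical.

```python
import math


def multiplicative_stats(eefs):
    """Geometric mean of EEFs and 2-sigma multiplicative factor on the mean.

    Assumption: statistics are computed on ln(EEF), using the sample
    standard deviation and the standard error of the mean.
    """
    logs = [math.log(e) for e in eefs]
    n = len(logs)
    mean_log = sum(logs) / n
    variance = sum((v - mean_log) ** 2 for v in logs) / (n - 1)
    geometric_mean = math.exp(mean_log)
    factor = math.exp(2.0 * math.sqrt(variance / n))
    # The interval is obtained by dividing and multiplying by the factor.
    return geometric_mean, factor, (geometric_mean / factor, geometric_mean * factor)


# Hypothetical per-flight EEFs: one underestimation and one overestimation
# by the same factor cancel out in the geometric mean.
mean_eef, unc_factor, interval = multiplicative_stats([0.5, 2.0])
```

Under this reading, an uncertainty "factor of about 1.4" means that the interval spans from the mean divided by 1.4 to the mean multiplied by 1.4.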

BC and NO x emission evaluation
Mean BC emissions results show considerable contrast between inventories and suggest on average a slight overestimation of the EMEP inventory (+9 % with MM5 data), a large overestimation of the TNO inventory (+45 %) and an underestimation of the TNO-MP inventory (−18 %). Results on NOx inventories show an overestimation ranging between +29 and +39 % depending on the inventory. As previously mentioned, NOy measurements may include a part of nitrate aerosols, but including them in the model has a very slight influence on results (NOx mean error factor changes remain below 11 % for all inventories). Despite some discrepancies on specific days between MM5 and WRF results, both give rather similar average emission biases. However, due to the strong variability from one day to the other, uncertainties on these average values are high for all inventories and both species, with a factor of about 1.39-1.47 for MM5 cases (at a 95 % confidence level) but reduced to 1.27-1.31 for WRF simulations.
[…] S4 in the Supplement). On average, ratios appear underestimated in EMEP and TNO-MP inventories (−18 and −46 %, respectively), while a very low bias (+7 %) is found for the TNO inventory. However, the day-to-day variability remains as high as that of BC and NOx taken individually, with an uncertainty in the mean around a factor of 1.31-1.41. In particular, rather small BC / NOx error factor ratios are obtained on 1, 13 and 21 July (and 25 July with MM5 meteorology) compared to the other days. Such a high day-to-day variability both in individual compound EEFs and in their ratio was not expected, which raises the question of its origin: does it come from the real-world emissions (missing in the model emission input data), or is it induced by uncertainties in the methodology, or both? In the next subsection, the variability potentially associated with observations themselves is discussed, while the variability that may come from the methodology (e.g., model errors) will be investigated in Sect. 4.3.4.

Variability in observations
When investigating ratios of the BC area over NOy area for observations and simulations separately (see Fig. S9 in the Supplement), one can notice that, regardless of inventory, simulated BC / NOx ratios remain rather constant from one flight to another, leading to small uncertainties in the average value. Conversely, ratios derived from observations are much more variable, with higher values on 1, 13 and 21 July compared to other days (ratios above 0.15 compared to 0.06-0.09, i.e., close to a factor of 2), which induces uncertainties in the BC / NOx average value on the same order of magnitude as those obtained on BC and NOx separately. The model fails to reproduce such an enhancement during those days. The reasons for such an increase are not clear, but we can discuss some possible sources of variability.

Regional background heterogeneity
On 1 July, BC measurements around Paris (and notably upwind of the city) show rather high but noisy concentrations (see Fig. S10 in the Supplement), which suggests a possible heterogeneity in the BC regional background. In our methodology, a unique regional background value is estimated, based on the whole flight. In the case of a rather slender BC plume coming from the north in the direction of Paris and adding itself to the city plume, our procedure would thus not be able to discriminate between the two. This may explain the high BC / NOy ratio observed on that day.

Time window of emission sampling
Another possible source of variability in the BC / NO x emissions is related to the time window of emission sampling, as BC / NO x diurnal profiles at LHVP show lower values during morning rush hours than at the end of the morning (∼ 0.04 compared to ∼ 0.07 µg m −3 ppb −1 ; see Fig. 3), with a noticeable day-to-day variability (see Fig. S11 in the Supplement).The Paris plume sampled by the plane in the early afternoon at a distance up to 100 km from the city center originates from prior emissions, over different time windows depending on the wind speed.In order to assess which emissions are sampled during each flight, a new simulation case is run with the MM5 meteorology during July with 16 tracer compounds emitted each hour in a cell in the center of Paris, from 00:00 to 16:00 UTC.These inert compounds are only advected and deposited on the ground.By interpolating their concentration along each flight path, it is possible to compute the contribution of emissions at a specific hour to the overall plume.Figure 9a gives an illustration for 28 July, for which observed wind speed at higher levels (110-210 m a.g.l.) is among the lowest (3.8 m s −1 ; see Table S3 in the Supplement).Tracer emissions follow a workday emission profile.Early morning emissions of the day (in cold colors) dominate the two last peaks, while the latter emission contribution (in green and hot colors) progressively increases in earlier peaks.The contribution of emissions at a specific hour is given by the integral ratio of the associated tracer concentration over the total concentration (black area in the figure).On this flight, the aircraft thus samples emissions over quite a large time window (00:00-11:00 UTC), with main contributions originating from 05:00-07:00 UTC emissions (which account for 50 % of the total area).The procedure is repeated for each flight, and contribution results for all flights are presented in Fig. 
9b, colored according to their average wind speed in altitude (Table S3 in the Supplement). Due to the large daily wind speed variability, sampling is quite different from one flight to the next. Largest windows (00:00-11:00 UTC) are sampled during the 13, 16 and 28 July flights, for which wind speed remains quite low. Most of the other flights (1, 9, 10, 20, 25 July) with intermediate wind speed have a sample window around 06:00-11:00 UTC, while the strongest wind speeds occurring on 15 and 21 July lead to sampling of 09:00-12:00 UTC emissions. As only late emissions are sampled on 21 July, this may explain higher ratios obtained in the plume. Unfortunately, no NOy measurements are available on 15 July with similar high wind speeds to confirm such a tendency. However, that explanation does not apply to the 13 July flight (large window), for which the high ratio thus remains unexplained. Note that the use of WRF is not expected to substantially modify the results obtained here with MM5 since major discrepancies between both meteorological model outputs only concern the BLH starting from 10:00 UTC and that emission tracers are here investigated relative to each other.

[Figure caption fragment: For each flight there is a corresponding line colored according to the mean wind speed in altitude given in Table S3 in the Supplement. Note that, as flights occur in the afternoon, only the first 16 h of the day are represented.]
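The tracer analysis above amounts to an integral ratio per emission hour. The following minimal sketch assumes tracer concentrations already interpolated along the flight path and a trapezoidal integral over distance; the function name, data layout and sample values are hypothetical.

```python
def emission_hour_contributions(distance_km, tracers):
    """Fraction of the total tracer plume area attributable to each emission hour.

    tracers maps an emission hour (UTC) to the list of concentrations of the
    corresponding inert tracer, interpolated along the flight path.
    """
    def integral(conc):
        total = 0.0
        for i in range(len(conc) - 1):
            step = distance_km[i + 1] - distance_km[i]
            total += 0.5 * (conc[i] + conc[i + 1]) * step
        return total

    areas = {hour: integral(conc) for hour, conc in tracers.items()}
    total_area = sum(areas.values())
    return {hour: area / total_area for hour, area in areas.items()}


# Hypothetical example with three hourly tracers along a short transect.
path = [0.0, 1.0, 2.0]
example = {5: [0.0, 2.0, 0.0], 6: [0.0, 1.0, 0.0], 7: [0.0, 1.0, 0.0]}
fractions = emission_hour_contributions(path, example)
```

Here the 05:00 UTC tracer accounts for half of the total area, in the same sense as the 50 % contribution of 05:00-07:00 UTC emissions quoted for 28 July.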

Uncertainties in the inversion methodology
The methodology used to evaluate NOx and BC emission inventories based on aircraft data over the Paris region intends to minimize several error sources: (i) the representativeness error by considering concentrations in the plume rather than at ground level, (ii) modeled chemistry errors by considering inert tracer species/families, and (iii) lateral dispersion and plume direction errors by considering integrated concentrations. The high EEF day-to-day variability previously noticed is partly due to a variability in Paris agglomeration emissions, but such large discrepancies are not expected and indicate that other uncertainty sources must be at play, among which are (i) the wind field errors and their impact on the emissions actually sampled by the plane, (ii) the errors in BLH and vertical mixing, (iii) the errors in deposition, and finally (iv) the discrepancies between EBC and EC. All these uncertainty sources are investigated in this section. Overall EEF uncertainties are then discussed in Sect. 4.3.5.

Wind speed and emission profiles
The methodology does not evaluate annual or monthly emissions alone but also part of the applied diurnal emission profiles (see Sect. 4.3.3), and errors on wind speed may shift the time window over which emissions are sampled (in simulations with respect to reality). This causes an additional uncertainty, which is all the more important when the time window is narrow and temporal emission gradients are strong. In addition, wind speed errors within the city directly determine the residence time of air masses close to emission sources and thus the degree of pollutant accumulation.
Significant wind speed NRMSE at the SIRTA site, both at ground level and at altitude levels below 200 m a.g.l. (around 40-60 % and 10-60 %, respectively), have been found (Sect. 4.1). These errors influence the accumulation of emitted pollutants within the city, for which uncertainties are thus probably quite important, as the accumulation time is at first order inversely proportional to the wind speed. Thus, this uncertainty in the local wind speed appears as an important source of uncertainty and variability in the day-to-day EEFs. However, biases in the wind speed are reasonable, for example mostly below ±30 % for the wind speed at SIRTA between 100 and 200 m a.g.l., thus indicating no particular bias in EEFs due to this error source.
Another uncertainty source is related to wind speed errors at higher altitudes (between the agglomeration and the measurement location) and subsequent errors on the plume advection. Given the diurnal profile of emissions and the variable emission time window sampled by the plane depending on the wind speed (Sect. 4.3.3), an error in advection would shift this time window toward earlier (later) emissions in the case of negative (positive) biases on wind speed in altitude. This error source thus appears all the more important as the gradient in the diurnal emission profile is high in the sampled time window. Daily biases on wind speed below ±20 % were found in airborne measurements along the flight path. If these uncertainties are taken as representative of the average wind between the emission source and the aircraft, the typical displacements of the emission time window are less than about 1 h. This would induce significant errors (say above 10 %) only for time windows between 05:00 and 08:00 UTC, when the temporal gradient in emissions is strong. Thus this error source should not be of major importance to explain the variability in results.
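The order of magnitude of this time window displacement can be checked with a simple advection estimate. The sketch below assumes straight-line transport at a constant mean wind speed between the source and the aircraft; the function name, distance and wind speed are hypothetical values chosen for illustration.

```python
def emission_time_shift_hours(distance_km, wind_speed_ms, relative_bias):
    """Shift of the sampled emission time caused by a wind speed bias.

    Positive values mean that the simulation samples later emissions than
    reality (positive wind bias, shorter simulated travel time); negative
    values mean earlier emissions.
    """
    true_travel_h = distance_km * 1000.0 / wind_speed_ms / 3600.0
    simulated_speed = wind_speed_ms * (1.0 + relative_bias)
    simulated_travel_h = distance_km * 1000.0 / simulated_speed / 3600.0
    return true_travel_h - simulated_travel_h


# Hypothetical case: aircraft 60 km downwind, true mean wind of 6 m/s.
shift_positive_bias = emission_time_shift_hours(60.0, 6.0, 0.20)
shift_negative_bias = emission_time_shift_hours(60.0, 6.0, -0.20)
```

For this made-up configuration, a ±20 % wind bias shifts the sampled emission time by roughly half an hour in either direction, and the shift grows with distance and with decreasing wind speed.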

Vertical mixing
As previously highlighted, aircraft measurements are expected to have a higher spatial representativeness than at ground level, but this relies on the assumption that the vertical mixing in the BL is correctly established, so that observations obtained in the plane, at an altitude of about 600 m a.g.l., can be considered representative of those in the whole plume.The vertical heterogeneity is expected to be the highest above the city and to decrease gradually along the plume due to the turbulent mixing and the absence (or the relatively poor contribution) of fresh emissions at ground level outside the city.The vertical turbulent mixing parameterization in the CHIMERE model follows the K-diffusion approach of Troen and Mahrt (1986) without a counter-gradient term.Vertical fluxes are directly proportional to the vertical turbulent diffusivity coefficient (K z ) that is bounded in the model by a minimum value of 0.01 and 1 m 2 s −1 in the dry and cloudy BL, respectively, and by a maximum value of 500 m 2 s −1 (Menut et al., 2013).To assess the influence of vertical mixing on results, a sensitivity test is performed by multiplying and dividing K z values by 2, as in Vautard et al. (2003).Both the dry minimum and the maximum boundaries are kept in the sensitivity test.Relative changes are shown in Fig. S12 in the Supplement.Dividing (multiplying) the K z by two leads to a moderate increase below +19 % (a decrease below −16 %) for BC and NO x , while the BC / NO x ratio does not change by more than 6 %.Such small changes are quite consistent with the results obtained over the Paris agglomeration by Vautard et al. (2003), who explain the moderate impact of K z on concentrations in altitude by the fact that a larger diffusivity increases both the incoming vertical flux from lower layers and the outgoing one toward higher layers.
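The Kz sensitivity test can be sketched as a scaling followed by the model bounds. This is only an illustration: it assumes that the scaling is applied before re-imposing the dry minimum (0.01 m² s⁻¹) and the maximum (500 m² s⁻¹) quoted above, which may differ from the actual order of operations in CHIMERE, and the function name and sample profile are hypothetical.

```python
def scale_kz(kz_values, factor, kz_min=0.01, kz_max=500.0):
    """Scale vertical diffusivity values (m2 s-1) and keep them within bounds.

    Assumption: bounds are re-applied after scaling, so values near the
    minimum or maximum are not changed by the full factor.
    """
    return [min(max(kz * factor, kz_min), kz_max) for kz in kz_values]


# Hypothetical Kz profile, from a nearly stable layer to a well-mixed one.
profile = [0.015, 10.0, 300.0]
doubled = scale_kz(profile, 2.0)
halved = scale_kz(profile, 0.5)
```

Because of the bounds, the largest value is capped at 500 m² s⁻¹ when doubled and the smallest one is held at 0.01 m² s⁻¹ when halved, so the effective perturbation is smaller than a factor of 2 at both ends of the range.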

Boundary layer height (BLH)
The BLH is the other important parameter that requires correct modeling, since it determines the volume into which the emissions will be diluted within the plume.During early afternoon, lidar observations at the SIRTA and LHVP sites showed an underestimation by the MM5 model, while significant improvements are obtained with the WRF model but still with a negative bias at the SIRTA suburban site (Sect.4.1).If such an underestimation exists in the whole flight region, it may lead to an overestimation of emission biases.However, processes are not linear, since the increased concentrations due to a lower BLH may, for instance, be reduced by a higher dry deposition (which depends on concentrations in lowest level).In order to assess the importance of these errors, a sensitivity test is performed with the EMEP/MM5 case by increasing the BLH by 30 % (corresponding to the mean bias between 06:00 and 14:00 UTC).So far, simulated cases have been performed with prognostic turbulent parameters (i.e., directly taken from meteorological models).However, as the diffusivity coefficient depends on the BLH, the sensitivity test with BLH multiplied by 130 % is performed with the diagnostic option, in which K z is calculated within the CTM as a function of the BLH.Except for some specific dates (10, 20 and 28 July), a larger BLH leads to lower concentrations and therefore decreases error factors (see relative changes in Fig. S12 in the Supplement).On average, changes are around −14 % for both BC and NO x and have no influence on the mean BC / NO x ratio (−1 %).Thus, the uncertainty in BLH could both contribute to the variability and bias in BC and NO x EEFs, while the BC / NO x ratio is rather unaffected by these errors.

Deposition
BC and NOy are expected to be conservative at the timescale of the flight, but they both undergo deposition. Errors in the simulated deposition and/or in the NOy speciation (given the large differences of deposition rates among NOy individual compounds) may impact emission bias results. Meteorological conditions indicate that wet removal is likely to be negligible over the campaign region, and the deposition is thus essentially dry. In order to assess the influence of deposition on results, a sensitivity test based on the EMEP/MM5 case is performed without any dry or wet deposition. Relative changes in BC, NOx error factors and their ratios are reported in Fig. S13 in the Supplement. Removing deposition increases all error factors by various amounts depending on the day. On average, error factor changes on BC and NOx are around +7 and +16 %, respectively. Without deposition, the BC / NOx error factor ratio is decreased by 9 % on average. These figures are upper limits, as errors in deposition speed are most probably less than 100 %. Thus, uncertainty in deposition is unlikely to affect the error budget substantially.

Mass-specific absorption coefficient (MAC)
As previously mentioned, an additional uncertainty may arise from the comparison between EBC (observations) and EC (emissions and simulations) through the MAC value used to convert absorption coefficients into EBC concentrations. Airborne PSAP EBC concentrations have been obtained considering a constant MAC of 8.8 m² g⁻¹ deduced from measurements at the LHVP site in Paris (see Sect. S1 in the Supplement). The relevance of comparisons performed in this study with the simulated EC thus relies on the hypothesis that this MAC determined in the center of Paris is valid at the scale of the whole agglomeration, and that it remains constant along the flight. This is supported by the MAC value estimated in winter 2009 by Sciare et al. (2011) at a suburban site 20 km southwest of Paris (7.3 ± 0.1 m² g⁻¹), which remains of the same order of magnitude as the one obtained here in the center of Paris. Indeed, during that winter season, but 1 year later, single-particle aerosol time-of-flight mass spectrometer observations performed at the LHVP site during the MEGAPOLI winter campaign showed a majority of already internally mixed BC particles (with a shell of organic material and secondary inorganic compounds) (Healy et al., 2012). Therefore, the MAC variations along the flight are expected to be moderate. This is also supported by the analysis of BC / NOy ratios obtained from aircraft observations, which does not show any significant increase with distance from Paris, as would be expected if the MAC value increased with distance from the emission source. Additionally, it is worthwhile noting that direct measurements of the MAC enhancement by Cappa et al. (2012) recently showed a very low enhancement between near-source and more distant values, by around only +6 %, onboard a ship along the California coast (CalNex campaign) and at a ground site located 14 km from Sacramento (Carbonaceous Aerosols and Radiative Effects Study (CARES) campaign). Considering the previous MAC estimations in the Paris region (7.3 and 12.0 m² g⁻¹ by Sciare et al. (2011) and Liousse et al. (1993), respectively), the uncertainty associated with our MAC value (8.8 m² g⁻¹) is roughly estimated to be 30 %.
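The effect of the MAC choice on EBC can be made explicit with a short sketch. It assumes the usual conversion in which the EBC mass concentration equals the absorption coefficient divided by the MAC (with absorption in Mm⁻¹ and MAC in m² g⁻¹, the result is in µg m⁻³); the function names are hypothetical, and only the three MAC values quoted above are taken from the text.

```python
def ebc_from_absorption(b_abs_per_Mm, mac_m2_per_g=8.8):
    """EBC concentration (ug m-3) from an absorption coefficient (Mm-1).

    1 Mm-1 divided by 1 m2 g-1 equals 1e-6 g m-3, i.e., 1 ug m-3.
    """
    return b_abs_per_Mm / mac_m2_per_g


def ebc_rescaling(mac_used, mac_alternative):
    """Factor by which EBC would change if another MAC had been used."""
    return mac_used / mac_alternative


# Effect of replacing the MAC used here (8.8) by the two other Paris estimates.
with_sciare_mac = ebc_rescaling(8.8, 7.3)
with_liousse_mac = ebc_rescaling(8.8, 12.0)
```

Using 7.3 m² g⁻¹ would raise EBC by about 21 %, and using 12.0 m² g⁻¹ would lower it by about 27 %, which is consistent with the rough 30 % uncertainty quoted above.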

Statistical significance of the results
Results obtained for each compound in Sect. 4.3.2 consist of mean error factors and rather large confidence intervals that result from (i) uncertainties associated with the day-to-day variability which is not included in the model input data (beyond the temporal dependence on the month and the day of the week), (ii) measurement uncertainties and (iii) uncertainties in the methodology (conditioned by error sources in the model).
The first type of uncertainties is difficult to quantify but can be reasonably considered random.Also, measurement uncertainties are probably mostly random, but they may include a part of systematic uncertainties.In order to be conservative, they are assumed to be entirely systematic.Uncertainties in the methodology have been discussed in previous sections, notably through various sensitivity tests on deposition, boundary layer height and the turbulence diffusivity coefficient.Results have shown that all investigated uncertainties in the model influence mean EEFs, as well as their variability.They have therefore a systematic and a random part, which we could estimate in the previous sensitivity tests.These tests have shown a significant day-to-day variability, which suggests that these uncertainties are probably partly random and may thus explain most of the day-to-day variability obtained in the first results (Sect.4.3.2).It appears to be rather tricky (and uncertain) to explain all discrepancies between individual flight results on a quantitative basis, notably due to the fact that several uncertainty sources are potentially combined.In spite of that, the choice is made to replace the uncertainty obtained in Sect.4.3.2 by a combination of all the systematic uncertainties estimated in the previous subsection.Results of individual and the derived overall systematic uncertainty are reported in Table 3, as well as final confidence intervals on our estimation of EEFs.
Confidence intervals (at a 95 % confidence level) on average emission error biases deduced from the overall uncertainty are reported in Table 4. For NOx emissions, positive biases are found in all inventories. Considering the 95 % confidence intervals, the bias in the TNO inventory appears statistically insignificant, which may not be the case for the EMEP and TNO-MP inventories, for which a slight overestimation remains probable (the lower bound of the confidence interval, −4 %, being very close to zero). These biases are in the range of the 35 % agreement found for NOx emissions in Paris during the ESQUIF project in summer 1999 by Vautard et al. (2003), also based on airborne measurements and CHIMERE simulations but using an alternative method and an older emission inventory prepared by AIRPARIF. Through an inverse modeling exercise based on satellite NO2 columns, Konovalov et al. (2006) obtained a similar 30 % overestimation of the EMEP inventory in the Paris area. Through another inverse emission modeling based on ground measurements over the Paris region, Deguillaume et al. (2007) found no significant bias, but an uncertainty of about ±20 %. Note that these studies were performed using different emission inventories. Given the uncertainties, the NOx emissions positive bias around 20-30 % found here in most of the inventories does not appear to be significant.
Also, neither the positive bias (around +12 %) of EMEP nor the negative one (around −23 %) of TNO-MP BC emissions is significant, while the 95 % confidence interval of the TNO inventory indicates a probable overestimation of BC emissions in that inventory. As previously mentioned in Sect. 3.2, the overestimation of BC emissions in the TNO inventory can probably be explained by the spatial distribution procedure, which concentrates too many emissions in the city. For example, using population density as a proxy implies the assumption of constant per capita emissions over the country, which might lead to an overestimation of urban BC emissions as discussed in Timmermans et al. (2013) and references therein. At this stage, it should be emphasized that discrepancies between EMEP and TNO BC results are mostly related to differences in their spatial resolution since input data for national totals are similar. Accordingly, Paris region total emissions in July are quite similar in both the EMEP and TNO inventories (as shown in Table S2 in the Supplement). However, at the scale of the Paris agglomeration, total emissions in both inventories do show discrepancies, with TNO emissions being more concentrated in the city due to their finer resolution, while the EMEP emissions spill over into rural areas of the Paris region due to their coarser resolution (see Sect. 3.2). Therefore, in our opinion, the better results obtained with EMEP need to be interpreted with caution. Potential errors in the distribution of BC emissions are partly avoided in the TNO-MP inventory, which follows a more (but not fully) bottom-up approach. Concerning the BC / NOx emission ratio, the only statistically significant negative bias concerns the TNO-MP inventory, while results for both the EMEP and TNO inventories suggest error compensation in BC and NOx emissions, leading to a satisfactory estimation of the BC / NOx emission ratio. It is worthwhile reiterating that, in this study, the same BC speciation table (primarily built for the TNO inventory) has been used in all inventories in order to be consistent, but the use of a speciation more specific to the Paris region would possibly change these results, in particular for the TNO-MP inventory, in which a part of the bottom-up information is lost through the use of a constant BC speciation in this study. Another point to be mentioned concerns the emissions' interannual variability, which adds an uncertainty (similar for all inventories) due to the comparison of observations from 2009 with inventories built for 2005.
To our knowledge, an evaluation of BC emissions as presented here has not yet been attempted at the scale of a megacity, and uncertainties estimated at the global or regional scale are difficult to extrapolate to an agglomeration. For comparison, through their adjoint inverse modeling exercise over Asia, Hakami et al. (2005) found quite consistent total assimilated and base case BC emissions over Asia, but they underlined higher discrepancies at regional scale, with major errors over Japan and northern and southern China of about a factor of 2.

Surface vs. airborne results: representativeness issues
Results obtained at ground level in Paris show a high overestimation of the BC / NOx ratio in the TNO (∼ factor of 4) and EMEP (∼ factor of 3) inventories, and to a lesser extent in TNO-MP (∼ factor of 2). This is not consistent with results obtained in the plume where the BC / NOx emission ratio appears highly underestimated in TNO-MP (while errors are lower for EMEP and TNO). There are several factors that may at least partly explain these discrepancies between ground and airborne results. The main factor is probably the difference of representativeness between both approaches, with ground concentrations being influenced by emissions in the vicinity of the LHVP station, while concentrations in the plume integrate emissions at a much larger scale (the whole agglomeration). In order to assess the LHVP site representativeness, a simulation with spatially traced emissions around that site is performed over a few days (see Sect. S4 in the Supplement). LHVP concentrations appear mainly influenced by close emissions, with a contribution of 50-85 % from emissions within 6 km around the site. Conversely, beyond a distance of 21 km (which still covers the agglomeration), emissions contribute less than 10 %. These contributions are quite variable depending on the wind field, with the importance of close emissions increasing with stagnant conditions. Since BC and NOx emissions as well as their ratio are highly heterogeneous over the whole Paris region (see Sect. 3.2 and Fig. S3 in the Supplement), results obtained at the LHVP site thus cannot be representative of the whole agglomeration but probably only its central part.
Additionally, the previous tracer experiment takes into account neither the sub-grid emission heterogeneity at a resolution of 3 km × 3 km (e.g., a park and a stretch of the Paris ring road are included in the LHVP cell) nor the sub-cell processes caused by the high complexity of urban environments (e.g., street canyons, building-induced turbulence). The LHVP spatial representativeness may thus be even lower. Working at the plume scale strongly reduces these limitations since (i) all emissions within the agglomeration end up in the plume and (ii) mixing during a few hours of transport from the source regions to the measurement locations is expected to significantly increase the concentration representativeness.
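The plume-scale comparison can be sketched as follows, under stated assumptions. Each transect is reduced to the integral of the concentration above background, and the simulated integral is divided by the observed one. Treating this ratio as the emission error factor (EEF), using a constant background per transect, and integrating with the trapezoidal rule are all assumptions made for illustration; the function names and the sample values are hypothetical.

```python
def plume_area(positions_km, concentrations, background):
    """Trapezoidal integral of (concentration - background) along a transect."""
    area = 0.0
    for i in range(1, len(positions_km)):
        dx = positions_km[i] - positions_km[i - 1]
        left = concentrations[i - 1] - background
        right = concentrations[i] - background
        area += 0.5 * (left + right) * dx
    return area


def emission_error_factor(positions_km, observed, simulated,
                          obs_background, sim_background):
    """Ratio of simulated to observed plume area (assumed EEF definition)."""
    obs_area = plume_area(positions_km, observed, obs_background)
    sim_area = plume_area(positions_km, simulated, sim_background)
    if obs_area <= 0:
        raise ValueError("observed plume area must be positive")
    return sim_area / obs_area


# Hypothetical transect: distance across the plume (km) and concentrations.
positions = [0.0, 10.0, 20.0, 30.0, 40.0]
observed = [1.0, 2.0, 3.0, 2.0, 1.0]
simulated = [1.0, 2.5, 3.5, 2.5, 1.0]
eef = emission_error_factor(positions, observed, simulated, 1.0, 1.0)
print(f"EEF = {eef:.3f}")
```

Because only the integrated area enters the ratio, a simulated plume that is displaced sideways or spread differently, but still fully sampled by the transect, yields the same value in this sketch.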
This would therefore suggest that the TNO-MP inventory gives the best BC / NOx emission ratio in the center of Paris but highly underestimates it at the scale of the whole agglomeration, in contrast to the TNO inventory, which gives better results at that scale. Compared to TNO-MP, NOx emissions in TNO are quite similar, while BC emissions are higher and more concentrated in the center of Paris.

Conclusion
Black carbon (BC) emissions are still highly uncertain, and very few studies have attempted to evaluate their inventories. This paper presents an original approach, based on airborne measurements across the Paris plume, developed in order to evaluate BC and NOx emissions at the scale of the whole agglomeration. It is applied to three emission inventories (EMEP, TNO, TNO-MP). In order to assess the benefit of such a methodology, BC / NOx ratios at the LHVP ground site in Paris are first investigated. Over the whole of July, these ratios show a significant (at the 95 % confidence level) overestimation in all inventories, with biases ranging between a factor of 2 in TNO-MP and a factor of 4 in TNO. On average, results obtained from July airborne observations give an overestimation of NOx emissions of around +20-30 % for all inventories, moderate biases of around +12 and −23 % for EMEP and TNO-MP BC emissions, respectively, and a higher positive bias of +40 % for the TNO BC inventory. However, these results present an unexpectedly high day-to-day variability (up to a factor of about 3). Low biases are also obtained on the BC / NOx emission ratio for the EMEP and TNO inventories (−18 and +13 %, respectively), in contrast with the TNO-MP inventory, which shows an underestimation of −44 %.
Various uncertainty sources in the methodology are investigated through sensitivity tests (wind field errors, boundary layer height, vertical mixing, deposition, and BC nature, i.e., equivalent BC vs. elemental carbon) and are likely to explain this variability. Results of these tests are used to derive a systematic uncertainty between 35 and 48 % in EEFs. This suggests that a moderate overestimation of NOx July emissions in the EMEP and TNO-MP inventories is statistically probable. Biases found in EMEP and TNO-MP BC emissions are not significant. However, the overestimation in TNO BC emissions does appear to be significant. It is probably due to the distribution proxies used to downscale national total emissions, which allocate too large emissions to a highly populated area such as Paris, where per capita emissions are lower than in rural areas. The BC / NOx emission ratio appears underestimated in the TNO-MP inventory, while non-significant biases are obtained with both the EMEP and TNO inventories. While discrepancies between the EMEP and TNO inventories are likely due to differences in spatial resolution and allocation, those between TNO and TNO-MP illustrate the distinction between bottom-up and top-down approaches. Results obtained at a ground-based site in Paris are not consistent with those obtained in the plume, because surface measurements are only representative of an area within a few kilometers around the LHVP site, while emissions from the whole agglomeration are sampled in the Paris plume.
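One common way to merge the outcome of several sensitivity tests into a single systematic uncertainty is a root-sum-square combination, sketched below. This rule is an assumption that holds only for independent sources; it is not stated here that the uncertainties of this study were combined in this way. The function name `combine_in_quadrature` and the per-source values are hypothetical and chosen purely for illustration.

```python
import math


def combine_in_quadrature(relative_uncertainties):
    """Root-sum-square of relative uncertainties (assumes independent sources)."""
    return math.sqrt(sum(u * u for u in relative_uncertainties.values()))


# Hypothetical 2-sigma relative uncertainties per source, for illustration only.
sources = {
    "wind field": 0.20,
    "boundary layer height": 0.20,
    "vertical mixing": 0.10,
    "deposition": 0.10,
    "BC nature": 0.20,
}
total = combine_in_quadrature(sources)
print(f"combined relative uncertainty: {total:.0%}")
```

With such a rule the largest individual terms dominate the total, so reducing a minor source changes the combined uncertainty very little.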
Finally, best estimates of BC and NOx emission biases do not exceed ±40 %, which appears to be rather moderate considering the numerous uncertainties at stake in the construction of an inventory. Because methodological uncertainties are of the same order of magnitude, assessing the significance of all these results remains difficult. However, the methodology does succeed in highlighting some statistically significant biases: for each of BC emissions, NOx emissions and the BC / NOx ratio, at least one of the three inventories has been shown to be very likely biased. It is worth noting that the methodology used in this study evaluates not only an inventory by itself but also a particulate matter speciation table and a temporal disaggregation (monthly and diurnal), which are also subject to potential errors.
To our knowledge, this study is one of the most comprehensive evaluations of BC emissions at the scale of a megacity. The comparison of aircraft- and ground-based results has given an interesting insight into the potential error compensation in the spatial allocation of BC emissions over a large agglomeration. In the framework of the PRIMEQUAL PREQUALIF project, a dense BC network of 14 stations (of various typologies, e.g., rural, urban, traffic) has been installed over the Paris region. It will allow for a better characterization of the BC spatial distribution over the agglomeration. In line with this, an interesting prospect would be to compare the observed distribution with the simulated spatial distribution constrained by emission inventories.
The Supplement related to this article is available online at doi:10.5194/acp-15-9799-2015-supplement.

H. Petetin et al.: Evaluating BC and NOx emission inventories for the Paris region

3 Input data

Figure 1. BC (left panels) and NOx (right panels) emissions in EMEP, TNO and TNO-MP inventories.

Figure 3. BC and NOx concentrations and BC / NOx ratio at the LHVP urban background site during July flight dates (left panels) and associated diurnal profiles (right panels).

Figure 4. Observed and simulated BC vs. NOx concentrations at LHVP between 05:00 and 08:00 UTC considering July flight days, and linear fits (lines). Only simulations with WRF meteorological data are reported here.

Figure 5. Observed (along the aircraft trajectory) and modeled (in the background, with the TNO-MM5 case) BC concentration for 10 (left panel) and 13 July (right panel). Paris and some other large cities are indicated. Simulated concentrations shown here are taken at 13:00 UTC on the fourth layer, which roughly corresponds to 470-870 m height. The solid black line corresponds to the flight path outside that layer (altitude above 870 or below 470 m).

Figure 6. Observed (in black) and simulated BC concentrations along the aircraft trajectory for 10 (left panel) and 13 July (right panel).

Figure 7. BC (top panel) and NOx (bottom panel) emission error factors (EEFs) for each individual flight (when available) and averaged EEF (on the right) with 95 % confidence intervals, for all six simulated cases (logarithmic scale).
BC / NOx emission ratio evaluation
Ratios of BC EEFs over NOx ones are shown in Fig. 8 (results considering all flights are reported in Table

Figure 8. BC / NOx EEF ratio for each individual flight and averaged for all flights (on the right) with confidence interval, for all six simulated cases (logarithmic scale).

Figure 9. (a) Concentration along the flight trajectory of the 16 tracers (colored lines) and their total (black) during the 28 July flight. The emission profile gives the color associated with each tracer as well as their emission intensity. The hourly emissions contribution over all peaks of each tracer to the total (in terms of area) is shown in the top right corner. (b) Hourly emission contribution to total area for all flights. For each flight there is a corresponding line colored according to the mean wind speed in altitude given in Table S3 in the Supplement. Note that, as flights occur in the afternoon, only the first 16 h of the day are represented.

Table 1. Statistical results of MM5 (and WRF in parentheses) considering all July hours and only the 06:00-14:00 UTC time window (N represents the proportion of available data).

Table 3. Systematic 2σ uncertainties on BC and NOx error factors and BC / NOx error factor ratio from various sources, and associated confidence intervals on average emission error biases for the three inventories.

Table 4. Confidence intervals on average emission error biases for the three inventories.