Evaluating BC and NOx emission inventories for the Paris

High uncertainties affect black carbon (BC) emissions and, despite the important impact of BC on air pollution and climate, very few evaluations of BC emissions are found in the literature. This paper presents a novel approach, based on airborne measurements across the Paris, France, plume, developed in order to evaluate BC and NOx emissions at the scale of a whole agglomeration. The methodology consists of integrating, for each transect, the observed and simulated concentrations above background across the plume. This minimizes several error sources in the model (e.g. representativeness, chemistry, plume lateral dispersion). The procedure is applied with the CHIMERE chemistry-transport model to three inventories (the EMEP inventory, and the so-called TNO and TNO-MP inventories) over the month of July 2009. Various systematic uncertainty sources both in the model (e.g. boundary layer height, vertical mixing, deposition) and in observations (e.g. BC nature) are discussed and quantified, notably through sensitivity tests. Large uncertainty values are determined in our results, which limits the usefulness of the method to rather strongly erroneous emission inventories. A statistically significant (but moderate) overestimation is obtained on the TNO BC emissions and on EMEP and TNO-MP NOx emissions, as well as on the BC/NOx emission ratio in TNO-MP. The benefit of the airborne approach is discussed through a comparison with the BC/NOx ratio at a ground site in Paris, which additionally suggests a spatially heterogeneous error in BC emissions over the agglomeration.


Introduction
Knowledge of pollutant emissions is a key element in the field of air pollution. It provides essential information on the contribution of various source sectors to pollutant levels, which is required for targeting emission reduction measures. Emission inventories are a necessary input to chemistry-transport models (CTMs), which are important tools for atmospheric research and air quality management.

Among the various emitted species, black carbon (BC) aerosol is a chemical compound of major importance. In air quality, it contributes strongly to the health risk (Peng et al., 2009) related to fine particulate matter (PM2.5, particulate matter with aerodynamic diameter below 2.5 µm). It also plays a crucial role in the Earth's climate through the scattering and absorption of incoming solar radiation and the subsequent change in planetary albedo (direct effect) (Schulz et al., 2006; Yu et al., 2006), and through the modification of cloud properties, as BC coated with hydrophilic species acts as cloud condensation nuclei (indirect effect) (Lohmann and Feichter, 2005). The overall industrial-era BC radiative forcing (including direct, semi-direct and indirect effects, as well as albedo change due to deposition on snow) is estimated at +1.1 W m⁻², ranking second between the carbon dioxide (CO2, +1.56 W m⁻²) and methane (CH4, +0.86 W m⁻²) forcing (Bond et al., 2013).

However, high uncertainties still affect BC emission inventories, making the true forcing uncertain. [...] global BC emissions estimate is only 1.28, but higher values are given at the regional scale in Western and Central Europe, with ratios of 1.34 and 1.76, respectively. Uncertainties arise from emission factors (usually highly dependent on the conditions of use and the type of equipment), activity data and spatial distribution for some source sectors (e.g. wood burning heating). By [...] converters.
However, an efficient conversion of many other nitrogen-containing compounds (NOz = NOy - NOx) by the molybdenum converter can lead to interferences in the NOx concentrations (Dunlea et al., 2007). This positive artefact varies from one location to another, depending on the relative contribution of NOz compounds to the NOy family. Dunlea [...] measured at a 30 s time resolution with an instrument designed for airborne measurements that consists of three Ecophysics (CLD 780 TH) analyzers in which NO is measured using ozone chemiluminescence. NO2 is photolytically converted, and NOy is converted with H2 in a gold-covered heated oven (see Freney et al., 2013, for details). The limit of detection is 10 pptv. NO, NO2 and NOy measurement uncertainties have been estimated by these latter authors at 10, 20 and 20%, respectively. The measured NOy includes the following species: NO, NO2, HNO2, HNO3 [...] backscatter lidar data at a 5 min time resolution (Haeffelin et al., 2012). At LHVP, values are estimated from CL31 ceilometer data (Haeffelin et al., 2012). Traditional meteorological parameters (wind, temperature) are also measured at SIRTA where, additionally, Leosphere wind cube lidar measurements are available, providing wind measurements at a 10 min time resolution, every 20 m from 40 to 200 m above ground level (a.g.l.).

Emission inventories
Three European anthropogenic emission inventories are evaluated in this paper, all referring to year [...]

2. An inventory developed partly in the framework of the MEGAPOLI project by TNO. The [...]. It is the year 2005 base case inventory and also serves as a starting point for the inventory described next. In this paper it is referred to as the TNO inventory.

3. A third inventory based on the TNO inventory and with the same 1/8° × 1/16° longitude-latitude resolution, but incorporating bottom-up emission data over the four European megacities (Paris, London, Rhine-Ruhr and Po valley). The city emission inventories were compiled by local authorities responsible for city emission inventories and air quality, such as Airparif for Paris (Airparif, 2010). It is described in more detail by Denier van der Gon et al. (2011), and has previously been used in Zhang et al. (2013) and Timmermans et al. (2013). It will be referred to as the TNO-MP (MP for MegaPoli) inventory.

The same EC/OC speciation table, primarily associated with the TNO inventory, is applied in all inventories. Sector-dependent factors used to derive July emissions from annual ones are reported in Table S2 in the Supplement. This table also shows the total sectorwise BC July emissions over the region for the three inventories.

The resolution of both TNO and TNO-MP inventories is considerably improved compared to the EMEP inventory. Despite its coarse spatial resolution, the comparison of this latter inventory with the two refined ones remains relevant for several reasons: (i) before being applied to simulations, emissions are downscaled to the air quality model resolution, here a 3 km horizontal resolution, using the 1x1 km-resolved GLCF (Global Land Cover Facility) landuse database (Hansen et al., 2000; Hansen and Reed, 2000), and (ii) concentrations are considered in the Paris plume, i.e. at a rather large spatial scale, which decreases the influence of such a coarse resolution in the emissions. Emissions are apportioned according to several types of landuse: urban, rural, forest, crops and maritime (Menut et al., 2013). Because of their better horizontal resolution, the evaluation of emission inventories in this paper focuses mainly on the TNO and TNO-MP inventories; the EMEP inventory is taken as an additional reference emission inventory, as it is used in many studies in Europe. [...]

The spatial distribution of BC and NOx emissions in the Paris region during a typical working day in July is given in Fig. 1 for each inventory.
In order to illustrate the important differences in the spatial distribution of emissions between inventories, one can compute for all inventories the mean emissions of all cells within a certain distance around the LHVP site, from 0 (only the LHVP cell) to 80 km (the whole region) (see Fig. S3 in the Supplement). Compared with TNO-MP, the EMEP inventory BC and NOx emissions in the Paris center are relatively low, but increase further away. The [...] Conversely, TNO and TNO-MP inventories display significantly higher emissions in the agglomeration center, and lower ones further away. In summer, the TNO-MP inventory shows a spatial emission distribution quite similar to that of the TNO one. In particular, major highways around Paris, spatially unresolved or missing in the EMEP inventory, are clearly visible thanks to the refined resolution. However, BC emissions in TNO-MP are considerably lower than in TNO in the agglomeration. In absolute terms, discrepancies between both inventories mainly originate from road transport (SNAP sector 7) and residential/tertiary (SNAP 2) sources, and to a lesser extent from waste disposal (SNAP 9), non-road transport (SNAP 8) and industrial process (SNAP 4) sources (see Table S2 in the Supplement for details). However, the highest relative discrepancies (that exceed a factor of 3) are associated with SNAP 2, 8 and 9. For sources treated as area sources in the top-down TNO inventory (e.g. SNAP 2 and 9), they are likely due to the distribution proxies used to downscale national totals, leading to too high emissions in Paris where the population density is very high (Denier van der Gon et al., 2010). Concerning the SNAP 4 point sources, discrepancies can probably be explained by the use of generic capacity rules in TNO, rather than exact emissions in TNO-MP. Both inventories are equivalent outside this region. A quite similar pattern is given for NOx emissions, except that discrepancies between both inventories in Paris are much reduced. In terms of BC/NOx emission ratios, the highest values are given by the TNO inventory, followed by the EMEP one, and finally the TNO-MP one. Differences are largest in the center of Paris, and decrease when integrating over larger areas.

It is worth noting that, as previously mentioned, both PSAP and MAAP instruments are based on the measurement of light absorption, and observations should thus be referred to as EBC. [...] This point is discussed in Sect. S.1 in the Supplement and in Sect. 4.3.4. In the following, the term BC will be kept for convenience for both observations and simulations. [...]

The CHIMERE model simulates transport, gas-phase chemistry, some aqueous-phase reactions, size-dependent aerosol species including secondary organic aerosol, and dry and wet deposition. It treats coagulation, absorption as well as nucleation aerosol processes. Inorganic aerosol thermodynamic equilibrium is calculated using the ISORROPIA model (Nenes et al., 1998).

Model configuration and simulated cases
In this paper, simulations are performed for the summer MEGAPOLI campaign (July 2009) with a five-day spin-up period. Two nested domains of increasing resolution, CONT3 (0.5 × 0.5°, i.e. [...]

In this section, we first evaluate meteorological input data (Sect. 4.1). A first simple approach is then applied to evaluate BC emissions against NOx ones, based on ground-based measurements at the urban background LHVP site in Paris (approach n°1, Sect. 4.2). We then describe the procedure to evaluate BC emissions based on airborne measurements in the Paris plume, and present the corresponding results (approach n°2, Sect. 4.3). We finally discuss discrepancies between both [...]

Statistical metrics are defined as:
• Mean bias: [...] (1)
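The metrics used in the following evaluation (mean bias, normalised root mean square error NRMSE, correlation R) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `evaluation_metrics` is hypothetical, and normalising the bias and the RMSE by the observed mean is an assumption, since the original definitions are not reproduced in this text.

```python
import numpy as np


def evaluation_metrics(obs, sim):
    """Return basic model-evaluation statistics for paired obs/sim values.

    Assumption: bias and RMSE are normalised by the mean of the observations.
    Pairs where either value is NaN are discarded.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(sim))
    obs, sim = obs[valid], sim[valid]

    diff = sim - obs                      # model minus observation
    mean_obs = obs.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    return {
        "n": int(valid.sum()),            # number of valid pairs
        "mean_bias": diff.mean(),
        "normalized_mean_bias": diff.mean() / mean_obs,
        "rmse": rmse,
        "nrmse": rmse / mean_obs,
        "r": np.corrcoef(obs, sim)[0, 1],  # Pearson correlation
    }
```

Restricting the input arrays to the 06:00-14:00 UTC hours before calling the function would give the "morning hours" statistics discussed below.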

Evaluation of meteorological data
In this section, meteorological input data used in CHIMERE simulations, from both the MM5 and WRF models, are evaluated against observations at the surface and aloft.

Surface observations. Fig. 2 shows comparisons between observations and simulations for meteorological parameters obtained at the SIRTA ground site. Statistical results are reported in Table 1, considering all hours as well as only the 06:00-14:00 UTC time period (designated hereafter as morning hours), which is more relevant in our methodology since transport from the urban emission sources to the aircraft location occurs in the morning and the early afternoon. [...] on wind speed (+33%). BLH appears strongly underestimated, with a bias of -34%, reduced to -24% during morning hours. Satisfactory correlations are found for temperature (due to the diurnal cycle) and wind speed (R around 0.8-0.9), but lower ones are obtained for BLH (around 0.5). Conversely, the WRF model shows better results on temperature (now slightly overestimated, with a bias of +0.45°C) and overall BLH, with an underestimation reduced to -17% (and -12% during morning hours). Correlations on this latter parameter are significantly improved compared to the MM5 model (0.7 against 0.5). As one of the factors contributing to the BLH negative bias, diurnal profile comparisons (see [...] with WRF but is strongly reduced, which explains the better correlations.

Table 1: Statistical results of MM5 (and WRF in parentheses) considering all July hours and only the 06:00-14:00 UTC time window (N represents the proportion of available data).

[...] Table S3 in the Supplement. On average, low negative biases and reasonable NRMSE are obtained with both MM5 and WRF models. At the daily scale, biases on wind speed remain below ±30%, except on 13, 16 and 28 July (respectively 29 July), during which one or both models give high underestimations (respectively overestimation), up to -46% (respectively +27% with MM5). Errors in terms of NRMSE exceed 25% for all these dates, as well as on 21 July (above 26%) despite a low bias (error compensation between an underestimation during the first hours and an overestimation during the last hours).

Wind speed simulation results are much better along the aircraft path (not shown), with all biases remaining below ±20% while NRMSE ranges between 12 and 32%. The highest biases occur on 1 and 28 July, around -18%. From a general point of view, the moderate positive bias found on wind speed at ground vanishes aloft except at mid-altitude during specific days, leading to a noticeable decrease of the NRMSE. [...] afternoon (after 18:00 UTC). NOx is also overestimated, but mainly at the end of the day. Observed BC/NOx ratios are rather constant (0.06 µg m⁻³ ppb⁻¹ on average) in July except during some nights, but with a diurnal pattern showing lower values around 05:00 UTC and higher ones around midnight. CHIMERE also simulates rather constant ratios but with a positive bias with TNO and to a lesser extent with EMEP inventories, while the bias with TNO-MP emissions is rather small (< 13%). [...]

We now evaluate in more detail BC emissions relative to NOx ones. In order to be comparable with the airborne approach, only flight days are considered, but some results over the whole month of July will also be indicated. It is worth noting that, even if both BC and NOx are mainly locally emitted within the Paris agglomeration, biases may be partly related to errors in advected contributions: BC can be transported from outside (as on July 1), while some NOx may be advected during the night and the early morning (when its photolytic conversion into HNO3 or HONO is less active) or released by reservoir species (e.g. PAN). NOx measurements at two rural background stations in the south and south-east of Paris are available from the AIRPARIF network.
On average, the NOx regional background roughly accounts for 15-25% and 15-30% of the levels in Paris for observations and simulations, respectively. From one year of measurements during the PARTICULES campaign (Bressi et al., 2013), it has been found that the BC regional background contributes about a third of the annual BC urban background average in Paris, and that this fraction is probably underestimated by the CHIMERE model (Petetin et al., 2013). However, this uncertainty source remains difficult to quantify more precisely. Additionally, NOx chemiluminescence measurements may also include some NOz compounds (Dunlea et al., 2007), but this is not likely a large error source since the CHIMERE model gives an average NOx/NOy ratio above 92% at the [...]

Given all these elements, we thus consider the BC/NOx ratio over the 05:00-08:00 UTC time window, corresponding to rush hours when fresh NOx and BC are expected to dominate. BC versus NOx concentrations during that time window are represented in Fig. 4. Simulated slopes of BC versus NOx reported in Table 2 show a high overestimation with respect to observed ones for the TNO inventory, around a factor of 4. Overestimations are reduced to a factor of 2.8 and 2.2 for EMEP and TNO-MP inventories, respectively. Uncertainties on emission error factors (at the 95% confidence level) are nearly the same for all inventories, around 18%, since they essentially originate from the uncertainty on the slope deduced from observations (i.e. BC/NOx ratios are more variable in observations than in simulations). Note also that discrepancies between BC versus NOx slopes and BC/NOx ratios (for the latter, biases remain below +136%) are due to the diurnal variability of the measured BC/NOx ratio, which shows its lowest values during the 05:00-08:00 UTC time window.

Results finally indicate an overestimation of BC emissions relative to NOx emissions, particularly in both top-down inventories (EMEP and TNO), that is significantly reduced with the local bottom-up information integrated in the TNO-MP inventory. It is worth noting that, as a combustion-related species with a long lifetime, carbon monoxide is another appropriate candidate for the evaluation of BC emissions (Zhou et al., 2009). However, it should also be mentioned that, due to its significant background concentrations, higher uncertainties (compared to NOx) may arise from errors in the simulation of the regional background around Paris, even when considering only rush hours.
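The ground-based comparison above amounts to fitting a BC versus NOx slope for observations and for each simulation over the rush-hour window, then taking the ratio of the two slopes. A minimal sketch follows; the function names are hypothetical, and the use of an ordinary least-squares fit with a free intercept is an assumption, since the regression method is not specified in this text.

```python
import numpy as np


def bc_nox_slope(nox_ppb, bc_ugm3):
    """Slope of BC (µg m-3) against NOx (ppb) from an ordinary least-squares fit.

    Assumption: a straight line with a free intercept is fitted.
    """
    slope, _intercept = np.polyfit(np.asarray(nox_ppb, dtype=float),
                                   np.asarray(bc_ugm3, dtype=float), 1)
    return slope


def ground_error_factor(obs_nox, obs_bc, sim_nox, sim_bc):
    """Ratio of simulated to observed BC/NOx slopes.

    A value of 4 means the simulated BC/NOx slope is four times the observed one.
    """
    return bc_nox_slope(sim_nox, sim_bc) / bc_nox_slope(obs_nox, obs_bc)
```

In practice the inputs would first be restricted to the 05:00-08:00 UTC samples of the flight days, so that fresh emissions dominate both species.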

Approach n°2: emissions evaluation from airborne measurements
Given these first results obtained at ground, the alternative approach based on airborne measurements in the plume is now presented. The procedure is first described in detail, and the results are then shown and their uncertainties discussed.

Procedure to compute emission error factors (EEF)
As an illustration, the TNO/MM5 case for two flights on 10 and 13 July is considered. Aircraft trajectories and BC concentrations during these days are presented in Fig. 5. As previously mentioned, the inlet used to collect BC particles is characterized by a 50% passing efficiency aerodynamic diameter of 5.0 µm, and BC measurements are thus compared to the simulated BC concentration below 5 µm. Time series given in Fig. 6 show a series of peaks that correspond to [...] Havre, two industrial cities. [...]

Given that the aircraft does not exactly cross the plume perpendicularly, but with an angle α between [...] a linear relationship between point source emissions and concentrations in the plume, an emission error factor is finally defined for each flight as the ratio of the simulated area over the observed one (i.e. an error factor of two means an emission overestimation of 100%). The emission error factor is finally defined as: [...]

It is worth noting that such an evaluation applies to the combination of: (i) the PM emission inventory, (ii) the PM speciation into BC, (iii) the monthly emission factor for July and (iv) the hourly emission profile. While uncertainties are expected to be larger on the first two elements, the other two may also contribute to the errors. In the following discussion, it is to be kept in mind that the reference to BC emissions aggregates all these elements.

[...] Table S4 in Supplement). On average, ratios appear underestimated in EMEP and TNO-MP inventories (-18 and -46%, respectively), while a very low EEF (+7%) is found for the TNO inventory. However, the day-to-day variability remains as high as that of BC and NOx taken individually, with an uncertainty on the mean around a factor 1.31-1.41. In particular, rather small BC/NOx error factor ratios are obtained on 1, 13 and 21 July (and on the 25 with MM5 meteorology), compared to the other days.
(on the right) with confidence interval, for all six simulated cases (logarithmic scale).
Such a high day-to-day variability, both in the EEF of individual compounds and in their ratio, was not expected, which raises the question of its origin: does it come from the real-world emissions (missing in the model emission input data), is it induced by uncertainties in the methodology, or both? In the next subsection, the variability potentially associated with the observations themselves is discussed, while the variability that may come from the methodology (e.g. model errors) will be investigated in Sect. 4.3.4.
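As a rough illustration of the EEF procedure of this section, the sketch below integrates concentrations above background along one transect and forms the error factor as the ratio of the simulated area to the observed one. The function names, the trapezoidal integration and the sin(α) projection for a non-perpendicular crossing (with α taken here as the angle between the aircraft track and the plume axis) are assumptions made for the example; the exact expressions used in the study are not reproduced in this text.

```python
import numpy as np


def plume_area(distance_km, conc, background, crossing_angle_deg=90.0):
    """Integrate concentration above background along one plume transect.

    distance_km        : along-track distance of each sample
    conc               : concentration at each sample
    background         : single regional background value for the flight
    crossing_angle_deg : assumed angle between aircraft track and plume axis;
                         90 degrees means a perpendicular crossing
    """
    x = np.asarray(distance_km, dtype=float)
    excess = np.asarray(conc, dtype=float) - background
    # Trapezoidal rule written out explicitly
    area = np.sum(0.5 * (excess[1:] + excess[:-1]) * np.diff(x))
    # Project the along-track width onto the cross-plume direction (assumption)
    return area * np.sin(np.radians(crossing_angle_deg))


def emission_error_factor(sim_area, obs_area):
    """Ratio of simulated to observed area; 2 means a 100% overestimation."""
    return sim_area / obs_area
```

When observed and simulated concentrations are sampled along the same track with the same crossing angle, the angle correction cancels in the ratio; it only matters if the two plumes are crossed at different angles.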

Variability in observations
When investigating ratios of the BC area over the NOy area for observations and simulations separately (see Fig. S9 in Supplement), one can notice that, whatever the inventory, simulated BC/NOx ratios remain rather constant from one flight to the other, leading to small uncertainties on the average value. Conversely, ratios derived from observations are much more variable, with higher values on 1, 13 and 21 July compared to other days (ratios above 0.15 against 0.06-0.09, i.e. close to a factor of 2), which induces uncertainties on the BC/NOx average value of the same order of magnitude as those obtained on BC and NOx separately. The model fails to reproduce such an enhancement during those days. The reasons for such an increase are not clear, but we can discuss some possible sources of variability.

Regional background heterogeneity. On July 1, BC measurements around Paris (and notably upwind of the city) show rather high but noisy concentrations (see Fig. S10 in Supplement), which suggests a possible heterogeneity in the BC regional background. In our methodology, a unique regional background value is estimated, based on the whole flight. In the case of a rather slender BC plume coming from the north along the axis of the Paris plume and adding itself to the city plume, our procedure would thus not be able to discriminate between the two. This may explain the high BC/NOy ratio observed on that day.

[...] contribution of emissions at a specific hour to the overall plume. Fig. 9a gives an illustration for [...] m s⁻¹, see Table S3 in Supplement). Tracer emissions follow a working day emission profile. Early morning emissions of the day (in cold colors) dominate the last two peaks, while the contribution of later emissions (in green and hot colors) progressively increases in earlier peaks. The contribution of emissions at a specific hour is given by the ratio of the integral of the associated tracer concentration over that of the total concentration (black area on the figure). On this flight, the aircraft thus samples emissions over a quite large time window (00:00-11:00 UTC), with main contributions originating from 05:00-07:00 UTC emissions (which account for 50% of the total area). The procedure is repeated for each [...]

Note that the use of WRF is not expected to substantially modify the results obtained here with MM5, since major discrepancies between both meteorological model outputs only concern the BLH starting from 10:00 UTC, and emission tracers are here investigated relative to each other.

[...] given in Table S3 in the Supplement. Note that, as flights occur in the afternoon, only the first 16 hours of the day are represented.
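The tracer analysis described above (contribution of each emission hour as the ratio of its integrated tracer concentration over the integrated total) can be sketched as follows. The function name and array layout are hypothetical, and the total concentration is assumed here to be the sum of the hourly tracers.

```python
import numpy as np


def hourly_contributions(distance_km, tracer_conc):
    """Fraction of the integrated plume attributable to each emission hour.

    distance_km : along-track distance of each sample, shape (n_points,)
    tracer_conc : tracer concentrations, shape (n_hours, n_points); row h is
                  the tracer released during emission hour h
    Assumption: the total concentration is the sum of all hourly tracers.
    """
    x = np.asarray(distance_km, dtype=float)
    tracers = np.asarray(tracer_conc, dtype=float)
    dx = np.diff(x)
    # Trapezoidal integral of each tracer along the track
    areas = np.sum(0.5 * (tracers[:, 1:] + tracers[:, :-1]) * dx, axis=1)
    return areas / areas.sum()
```

Summing the returned fractions over, for instance, the rows for 05:00 to 07:00 UTC gives the share of the sampled plume that originates from that emission window.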

Uncertainties of the inversion methodology
The methodology used to evaluate NOx and BC emission inventories based on aircraft data over the Paris region intends to minimize several error sources: (i) the representativeness error, by considering concentrations in the plume rather than at ground, (ii) modeled chemistry errors, by considering inert tracer species/families, and (iii) lateral dispersion and plume direction errors, by considering integrated concentrations. The high day-to-day variability of emission error factors noticed previously is partly due to variability in Paris agglomeration emissions, but such large discrepancies are not expected from emissions alone and indicate that other uncertainty sources must be at play, among which: (i) wind field errors and their impact on the emissions actually sampled by the plane, (ii) errors in BLH and vertical mixing, (iii) errors in deposition, and finally (iv) discrepancies between EBC and EC. All [...] been found in airborne measurements along the flight path. If these uncertainties are taken as representative of the average wind between the emission source and the aircraft, the typical displacements of the emission time window are less than about 1 h. This would induce significant errors (say above 10%) only for time windows between 05:00 and 08:00 UTC, when the temporal gradient in emissions is strong. Thus this error source should not be of major importance in explaining the variability in results.

Vertical mixing. As previously highlighted, aircraft measurements are expected to have a higher spatial representativeness than ground measurements, but this relies on the assumption that vertical mixing in the BL is well established, so that observations obtained in the plane, at an altitude of about 600 m a.g.l., can be considered representative of those in the whole plume. The vertical heterogeneity is expected to be highest above the city and to decrease gradually along the plume due to turbulent mixing and the absence (or the relatively small contribution) of fresh emissions at ground outside the city. The vertical turbulent mixing parametrization in the CHIMERE model follows the [...] since the increased concentrations due to a lower BLH may for instance be reduced by a higher dry deposition (which depends on concentrations in the lowest level). In order to assess the importance of these errors, a sensitivity test is performed with the EMEP/MM5 case by increasing the BLH by 30% (corresponding to the mean bias between 06:00 and 14:00 UTC). So far, simulated cases have been performed with prognostic turbulent parameters (i.e. directly taken from meteorological models). However, as the diffusivity coefficient depends on the BLH, the sensitivity test with the BLH multiplied by 1.3 is performed with the diagnostic option, in which Kz is calculated within the CTM as a function of, among others, the BLH. Except for some specific dates (10, 20 and 28 July), a larger BLH leads to lower concentrations and therefore decreases error factors (see relative changes in Fig. S12 in the Supplement). On average, changes are around -14% for both BC and NOx, and have almost no influence on the mean BC/NOx ratio (-1%). Thus, the uncertainty in BLH could contribute both to the variability and to the bias in BC and NOx emission error factors, while the BC/NOx ratio is largely unaffected by these errors.

Deposition. BC and NOy are expected to be conservative at the time scale of the flight, but they both undergo deposition. Errors in the simulated deposition and/or in the NOy speciation (given the large differences in deposition rates among individual NOy compounds) may impact the emission bias results.
Meteorological conditions indicate that wet removal is likely to be negligible over the campaign region, and the deposition is thus essentially dry. In order to assess the influence of [...] The first ones are difficult to quantify but can reasonably be considered as random. Measurement uncertainties are also probably mostly random, but may include a systematic part. In order to be conservative, they are assumed to be entirely systematic. Uncertainties in [...] Table 3, as well as final confidence intervals on our estimation of emission error factors.

Confidence intervals (at the 95% confidence level) on average emission error biases deduced from the overall uncertainty are reported in Table 4. For NOx emissions, positive biases are found in all inventories. Considering the 95% confidence intervals, the bias in the TNO inventory appears statistically insignificant, which may not be the case in both EMEP and TNO-MP inventories, for which a slight overestimation remains probable (due to a confidence interval lower bound of -4%, thus very close to zero). These are in the range of the 35% agreement found for NOx emissions in Paris during the ESQUIF project in summer 1999 by Vautard et al. (2003), also based on airborne measurements and CHIMERE simulations, but using an alternative method and an older emission [...] an uncertainty of about ±20%. Note that these studies were performed using different emission inventories. Given the uncertainties, the positive bias of around 20-30% found here for NOx emissions in most of the inventories does not appear significant.

[...] indicates a probable overestimation of BC emissions in that inventory. As previously mentioned in Sect. 3.2, the overestimation of BC emissions in the TNO inventory can probably be explained by the spatial distribution procedure, which concentrates too large emissions in the city. For example, using population density as a proxy implies the assumption of constant per capita emissions over the country, which might lead to an overestimation of urban BC emissions, as discussed in Timmermans et al. (2013) and references therein. At this stage, it is to be emphasized that discrepancies between EMEP and TNO BC results are mostly related to differences in their spatial resolution, since input data for national totals are similar. Accordingly, Paris region total emissions in July are quite similar in both EMEP and TNO inventories (as shown in Table S2 in Supplement). However, at the scale of the Paris agglomeration, total emissions in both inventories do show discrepancies, TNO emissions being more concentrated in the city due to its finer resolution, while the EMEP emissions spill over into rural areas of the Paris region due to their coarser resolution (see Sect. 3.2). Therefore, in our view, the [...]

To our knowledge, a BC emissions evaluation as presented here has not yet been attempted at the scale of a large megacity, and uncertainties estimated at the global or regional scale are difficult to extrapolate to an agglomeration. For comparison, through their adjoint inverse modeling exercise over Asia, Hakami et al. (2005) found quite consistent total assimilated and base case BC emissions over Asia, but underlined higher discrepancies at the regional scale, with major errors over Japan and northern and southern China of about a factor of two (in either direction).

Surface versus airborne results : representativeness issues 30
Results obtained at ground level in Paris show a strong overestimation of the BC/NOx ratio in the TNO (about a factor of 4) and EMEP (about a factor of 3) inventories, and to a lesser extent in TNO-MP (about a factor of 2). This is not consistent with the results obtained in the plume, where the BC/NOx emission ratio appears strongly underestimated in TNO-MP (while errors are lower for EMEP and TNO). Several reasons may at least partly explain these discrepancies between ground and airborne results. The main one is probably the difference in representativeness between the two approaches: ground concentrations are influenced by emissions in the vicinity of the LHVP station, while concentrations in the plume integrate emissions at a much larger scale (the whole agglomeration). In order to assess the representativeness of the LHVP site, a simulation with spatially traced emissions around that site is performed over a few days (see Sect. S.4 in the Supplement). LHVP concentrations appear mainly influenced by nearby emissions, with a contribution of 50-85% from emissions within a radius of 6 km around the site. Conversely, beyond a radius of 21 km (which still covers the agglomeration), emissions contribute less than 10%. These contributions vary considerably with the wind field, the importance of nearby emissions increasing under stagnant conditions. Since BC and NOx emissions, as well as their ratio, are highly heterogeneous over the whole Paris region (see Sect. 3.2 and Fig. S3 in the Supplement), results obtained at the LHVP site cannot be representative of the whole agglomeration, but probably only of its central part.

Additionally, the tracer experiment described above accounts neither for the sub-grid heterogeneity of emissions at a resolution of 3 x 3 km (e.g. a park and a stretch of the Paris ring road are both included in the LHVP cell) nor for sub-cell processes caused by the high complexity of urban environments (e.g. street canyons, building-induced turbulence). The spatial representativeness of the LHVP site may thus be even lower. Working at the plume scale strongly reduces these limitations since (i) all emissions within the agglomeration end up in the plume, and (ii) mixing during a few hours of transport from the source regions to the measurement locations is expected to significantly increase the representativeness of the concentrations.
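The radius-based contributions quoted above can be illustrated with a minimal sketch. It assumes, hypothetically, that the tracer simulation yields one tagged concentration per concentric emission ring around the site; the function name, the outermost radius and all concentration values are illustrative assumptions and not outputs of the study.

```python
from typing import Dict, List, Tuple


def cumulative_contributions(ring_conc: List[Tuple[float, float]]) -> Dict[float, float]:
    """Cumulative share of the total tracer concentration at the site that is
    explained by emissions released within each outer radius.

    ring_conc: (outer_radius_km, tracer_concentration) pairs, one per
    concentric emission ring around the site, in any order.
    """
    total = sum(conc for _, conc in ring_conc)
    if total <= 0:
        raise ValueError("total tracer concentration must be positive")
    shares: Dict[float, float] = {}
    running = 0.0
    # Accumulate from the innermost ring outwards.
    for radius, conc in sorted(ring_conc):
        running += conc
        shares[radius] = running / total
    return shares


# Hypothetical tagged-tracer concentrations (arbitrary units), NOT study data.
example = [(6.0, 7.0), (21.0, 2.5), (100.0, 0.5)]
shares = cumulative_contributions(example)
within_6_km = shares[6.0]            # share from emissions within 6 km
beyond_21_km = 1.0 - shares[21.0]    # share from emissions beyond 21 km
```

With these made-up numbers, 70% of the signal comes from within 6 km and 5% from beyond 21 km, which falls inside the ranges reported for the LHVP site. Repeating the calculation for each simulated day would show how the shares change with the wind field.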
This would therefore suggest that the best BC/NOx emission ratio in the center of Paris is given by the TNO-MP inventory, whereas this inventory strongly underestimates the ratio at the scale of the whole agglomeration, contrary to the TNO inventory, which gives better results at that scale. Compared to TNO-MP, NOx emissions in TNO are quite similar, while BC emissions are higher and more concentrated in the center of Paris.
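The plume-scale values discussed in this section rest on integrating, for each transect, the observed and simulated concentrations above background across the plume. The sketch below is one possible reading of that step and not the authors' implementation: the trapezoidal rule, the constant background, the function names and the definition of the error factor as the simulated-to-observed ratio of the integrals are all assumptions made for illustration.

```python
from typing import Sequence


def integrated_excess(distance_km: Sequence[float],
                      conc: Sequence[float],
                      background: float) -> float:
    """Trapezoidal integral of (conc - background) along a transect."""
    if len(distance_km) != len(conc) or len(conc) < 2:
        raise ValueError("need two or more matching distance/concentration points")
    total = 0.0
    for i in range(1, len(conc)):
        left = conc[i - 1] - background
        right = conc[i] - background
        total += 0.5 * (left + right) * (distance_km[i] - distance_km[i - 1])
    return total


def emission_error_factor(distance_km: Sequence[float],
                          observed: Sequence[float],
                          simulated: Sequence[float],
                          obs_background: float,
                          sim_background: float) -> float:
    """Assumed definition: simulated / observed integrated excess.
    Values above 1 would then indicate overestimated emissions."""
    obs = integrated_excess(distance_km, observed, obs_background)
    sim = integrated_excess(distance_km, simulated, sim_background)
    if obs <= 0:
        raise ValueError("observed plume excess must be positive")
    return sim / obs


# Hypothetical transect (NOT study data): the simulated plume is wider and
# flatter than the observed one, but carries the same integrated excess.
x = [0.0, 10.0, 20.0, 30.0, 40.0]
obs = [1.0, 1.0, 5.0, 1.0, 1.0]
sim = [1.0, 2.0, 3.0, 2.0, 1.0]
factor = emission_error_factor(x, obs, sim, obs_background=1.0, sim_background=1.0)
peak_ratio = (max(sim) - 1.0) / (max(obs) - 1.0)
```

In this made-up case the peak excess is off by a factor of 2 while the integrated error factor equals 1. This shows why integrating across the plume makes the comparison less sensitive to errors in lateral dispersion than a point-by-point comparison.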

Conclusion
Black carbon (BC) emissions are still highly uncertain, and very few studies have attempted to evaluate their inventories. This paper presents an original approach, based on airborne measurements across the Paris plume, developed in order to evaluate BC and NOx emissions at the scale of the whole agglomeration. It is applied to three emission inventories (EMEP, TNO, TNO-MP). In order to assess the benefit of such a methodology, BC/NOx ratios at the LHVP ground site in Paris are first investigated. Over the whole month of July, they show a significant (at the 95% confidence level) overestimation in all inventories, with biases ranging between a factor of 2 in TNO-MP and a factor of 4 in TNO. On average, results obtained from the July airborne observations give an overestimation of NOx emissions of around +20-30% for all inventories, a moderate bias of around +12 and -23% for EMEP and TNO-MP BC emissions, respectively, but a higher positive bias of +40% for the TNO BC inventory. However, these results present an unexpectedly high day-to-day variability (up to a factor of about 3).
Low biases are also obtained on the BC/NOx emission ratio for the EMEP and TNO inventories (-18 and +13%, respectively), contrary to the TNO-MP inventory, which shows an underestimation of -44%. Various uncertainty sources in the methodology (wind field errors, boundary layer height, vertical mixing, deposition, and BC nature, i.e. equivalent BC versus elemental carbon) are investigated through sensitivity tests and are likely to explain this variability. Results of these tests are used to derive a systematic uncertainty of between 35 and 48% on emission error factors. This suggests that a moderate overestimation of July NOx emissions in the EMEP and TNO-MP inventories is statistically probable. Biases found in EMEP and TNO-MP BC emissions are not significant. However, the overestimation in TNO BC emissions does appear significant. It is probably due to the distribution proxies used to downscale national total emissions, which allocate too large emissions to a highly populated area such as Paris, where per capita emissions are lower. The BC/NOx emission ratio appears underestimated in the TNO-MP inventory, while non-significant biases are obtained with both the EMEP and TNO inventories. While discrepancies between EMEP and TNO inventories are likely due to [...]

Finally, best estimations of BC and NOx emission biases thus do not exceed ±40%, which appears rather moderate considering the numerous uncertainties involved in the construction of an inventory. Because methodological uncertainties are of the same order of magnitude, assessing the significance of all these results remains difficult. However, the methodology does succeed in highlighting some statistically significant biases; in particular, for BC emissions, for NOx emissions and for the BC/NOx ratio, at least one of the three inventories has been shown to be very probably biased.
It is worth noting that the methodology used in this study evaluates not only an inventory by itself but also a particulate matter speciation table and a temporal disaggregation (monthly and diurnal), which are also subject to potential errors.

To our knowledge, this study is one of the most comprehensive evaluations of BC emissions at the scale of a large megacity. The comparison of aircraft- and ground-based results has given an interesting insight into the potential error compensation in the spatial allocation of BC emissions over a large agglomeration. In the framework of the PRIMEQUAL PREQUALIF project, a dense BC network of 14 stations (of various typologies, e.g. rural, urban, traffic) has been installed over the Paris region. It will allow a better characterization of the BC spatial distribution over the agglomeration; an interesting outlook would thus be to compare it to the simulated spatial distribution constrained by emission inventories.

Acknowledgements
The research leading to these results has received funding from the European Union's Seventh Framework Programme FP/2007-2011 under grant agreement no. 212520. The authors also acknowledge the ANR through the MEGAPOLI PARIS and ADEME and LEFE through the MEGAPOLI France project for their financial support. This work is funded by a PhD DIM (domaine d'intérêt majeur) grant from the Ile-de-France region. We would like to thank the two anonymous referees for their valuable comments on this work.