The constraint of CO2 measurements made onboard passenger aircraft on surface-atmosphere fluxes: the impact of transport model errors in vertical mixing

Abstract. Inaccurate representation of atmospheric processes by transport models is a dominant source of uncertainty in inverse analyses and can lead to large discrepancies in the retrieved flux estimates. We investigate the impact of uncertainties in vertical transport as simulated by atmospheric transport models on fluxes retrieved using vertical profiles from aircraft as an observational constraint. Our numerical experiments are based on synthetic data with realistic spatial and temporal sampling of aircraft measurements. We compare the impact of such uncertainties on the fluxes retrieved using the ground-based network with their impact on the fluxes retrieved using the aircraft profiles. We find that the posterior flux retrieved using aircraft profiles is less susceptible to errors in boundary layer height than that retrieved using the ground-based network. This finding highlights a benefit of utilizing atmospheric observations made onboard aircraft over surface measurements for flux estimation using inverse methods. We further use synthetic vertical profiles of CO2 in an inversion to estimate the potential of these measurements, which will be made available through the IAGOS (In-Service Aircraft for a Global Observing System) project in future, in constraining the regional carbon budget. Our results show that tropical Africa and temperate Eurasia, regions that are under-constrained by the existing surface-based network, will benefit the most from these measurements, with a reduction of posterior flux uncertainty of about 7 to 10 %.

of the profiles. There are some earlier studies looking at aircraft vertical profiles and their use in inversions. Gloor et al. (2000) have considered aircraft vertical profiles in their studies on observing network extension.
A difference between fluxes estimated using near-surface observations and column average of the vertical profiles was discussed by Nakatsuka and Maksyutov (2009).

[AR]: These two references will be added to the manuscript and the text will be modified as follows:
[ME]: "Both studies focused specifically on the estimation of the tropical terrestrial fluxes using mostly the free tropospheric part of the aircraft profiles. Gloor et al. (2000) used aircraft vertical profiles in their studies for observing network extension. A difference between fluxes estimated using near-surface observations and the column average of the aircraft vertical profiles was discussed by Nakatsuka and Maksyutov (2009). However, so far, the suitability of aircraft vertical profiles and their treatment when using them in inversions, given the transport modelling errors related to vertical mixing, has not been addressed."
[RC] P3L37: Equation 1 is written or described incorrectly; in place of Cini should be the result of forward simulation with initial concentration Cini.
[AR]: C_ini is the initial atmospheric mixing ratio of the transport model at the beginning of the simulation period.

[RC] P4L25: Equations 4b and 6 give impression that prior flux error covariance matrix is omitted. These equations look different from Jena inversion system described by Rodenbeck et al (2005). Authors should review the Equations (3-6, 14) in (Rodenbeck et al 2005) and explain the changes, in case there are some.
[AR]: This part of the text will be condensed and the equations have been modified to resemble those in Roedenbeck et al. (2005).
[RC] P10L35: More discussion can be added on this topic. The transport model used in this study may not be the best one for actually analyzing the IAGOS observations in the PBL, due to a need to resolve plumes of anthropogenic CO2 transported from large cities near the airports. A relatively high model-observation mismatch of 5 ppm at 1 km, as shown in Fig. 1, was found for CONTRAIL data. High model-data mismatch (mdm) could partly be a result of applying a low-resolution (with respect to city plume size) model and meteorology; thus it should be considered as an upper bound on mdm. Using the data uncertainty based on the CONTRAIL mismatch for IAGOS looks justifiable with the current transport model, and the large data uncertainty may have resulted in relatively low flux uncertainty reductions in the order of 10. The ability of the low-resolution model to simulate CO2 concentration in the megacity plumes is questionable, with possible underestimation of the fossil CO2 component due to low model resolution (model is low biased), affecting the estimated fluxes.
[AR]: The discussion related to the transport model resolution will be added in the text.
[ME]: The text will be modified as follows:
"We must bear in mind that since the MOZAIC/IAGOS aircraft profiles are measured near the airports, which form areas of high anthropogenic emissions, it is likely that these observations are not truly representative of large areas. This fact has been taken into account, in this study, in a conservative way by estimating the model-data mismatch uncertainty using the difference between CO2 profiles from the CONTRAIL project and reanalysed TM3 fields (Sect. 2.2). However, better approaches for addressing this question of representativeness of aircraft profiles exist, for example, those described by Boschetti et al. (2015). A relatively high model-observation mismatch of 5 ppm at 1 km (as shown in Fig. 1) for CONTRAIL data could partly be a result of applying a low-resolution (with respect to the plume size of anthropogenic CO2 transported from large cities near the airports) model and meteorology and thus should be considered as an upper bound on the model-data mismatch."
[RC] P1L14: Suggest correcting "ground-based" to "ground-based"

[ME]: Text will be modified as follows: "Lack of measurements in the atmosphere or an unevenly distributed network of observation sites can result in a poorly constrained regional carbon budget (Gurney et al. 2002)."
[RC] P2L16: Suggest correcting "Checa-Garcia" to "Checa-Garcia"
[AR]: Done

Reviewer # 2
[RC] P1L14: I would include a noun after a word like "this" or "these" so it is clear to the reader what you are referring to in the sentence.
[RC] P1L14: consider changing "the benefit" to "a benefit"
[AR]: The text will be modified to clarify this point:
[ME]: "This finding highlights a benefit of utilizing atmospheric observations made onboard aircraft over surface measurements for flux estimation using inverse methods."
[RC] P1L20: Suggested rewording: "with a reduction of posterior flux uncertainty of about 7 to 10%."
[AR]: Text will be modified as per the suggested rewording
[ME]: "Our results show that the regions tropical Africa and temperate Eurasia, that are under-constrained by the existing surface based network, will benefit the most from these measurements, with a reduction of posterior flux uncertainty of about 7 to 10 %."
[RC] P1L22: add a comma before "and"

[AR]: Done
[ME]: "Reliable prediction of climate change scenarios requires a thorough understanding of the carbon-climate feedbacks in the earth system, and accurately estimating current sources and sinks of carbon is of prime importance."
[RC]: P1L24: What does "these" refer to here? I would include a noun after "these".
[AR]: The phrase "sources and sinks" will be added to this sentence.
[ME]: While it is impossible to measure these sources and sinks directly everywhere around the globe, we may estimate these using the 'top-down' approach employing atmospheric observations in combination with knowledge of atmospheric transport and prior knowledge of the fluxes by inverse modelling.
[AR]: The reference Gerbig et al., 2003 will be added to this sentence
[ME]: Unfortunately, the estimates of surface fluxes using this approach are prone to large uncertainties that can largely be attributed to imperfections in the transport models and insufficient data coverage by the observation network (Gerbig et al., 2003).
[AR]: The references Stephens et al., 2007; Gerbig et al., 2008 will be added to this sentence
[ME]: One of the dominant sources of transport model uncertainty is the inaccurate representation of the vertical mixing near the surface of the earth and hence the boundary layer height (Stephens et al., 2007; Gerbig et al., 2008).
[RC] P2L11-12: This point is true of short towers but not necessarily of tall towers (e.g., the NOAA tall tower network in the US).
[AR]: The text will be modified to clarify this statement
[ME]: In addition, these measurements, except those obtained from tall towers, are often not representative of large areas and provide information only at the local scale (Haszpra et al. 1999).

[RC] P2L14: Instead of "constraints", I would use a word like "limitations" ("... they have their limitations, too, which restricts their use for accurate flux ...."). The word "constraints" makes it sound like the satellite is constraining something (e.g., fluxes), not that the satellite has limitations.
[AR]: As per the suggestion, the word "constraints" will be replaced by "limitations"
[ME]: "However, they have their limitations, too, which restrict their use for accurate flux estimation using inverse methods."
[RC] P2L41: Suggested rewording: "... investigate theoretical impacts of transport model ...."
[AR]: The text will be modified according to the suggested rewording
[ME]: "In this paper we employ synthetic data to investigate theoretical impacts of transport model uncertainties associated with boundary layer height on the fluxes retrieved by using passenger aircraft profiles in an inverse modelling set-up."
[RC] P3L10-17: Some of the information here seems redundant with information in the previous two paragraphs. You may want to eliminate these lines or condense with the previous two paragraphs.
[AR]: The text in these lines will be removed to avoid repetition with information in the previous paragraph.
[RC] P3L20: I prefer active voice (E.g., "section 3 presents results") over passive voice ("In section 2, the results are presented"). This wording could be a matter of personal preference.

[AR]: Active voice will replace passive voice in this sentence.
[ME]: "Section 2 describes the methods used that include estimation of the model representation error (Section 2.1), description of the inversion scheme (Section 2.2) and the experimental set-up (Section 2.3). Section 3 presents the results from the simulations and the conclusions are discussed in Section 4."

[RC] P3L29: A 4x5 degree resolution seems really coarse for CO2 simulations. At this juncture, I imagine it would be difficult to find a higher resolution model and re-run all of the CO2 model simulations. However, somewhere in the paper (e.g., in a supplement), it might be useful to explain why you used this resolution and how the resolution affects your interpretation of the results.

[AR]: We agree that the resolution of the model used is not the best that is currently available. However, similar resolutions have been used in the past in other intercomparison projects like RECCAP (Peylin et al., 2013) and by Houweling et al. (2015).
The impact of the resolution is likely to be on the model-data mismatch calculated here using the CONTRAIL data, which should be taken as an upper bound. This fact will be incorporated in the discussion section.

[RC] P3L31: I would include a noun after "following."
[ME]: Text will be modified as follows: "In the following paragraphs, we provide a brief description of the inversion system described in more detail in Roedenbeck et al. (2005)."

[RC] P4Eq. 4b: Many inverse modeling studies include an a priori covariance matrix that defines uncertainties in the prior and spatial/temporal covariances in these covariances. Instead, you have used a weighting factor (mu). It could be helpful to explain why you have chosen the latter approach over the former.

[AR]: The factor 'mu', in this manuscript, is not related to the prior flux uncertainty or the spatial and temporal covariances in the fluxes. It is simply a factor that scales the contribution of the prior flux constraint to the inversion. It is a ratio between the a-priori information and the data constraint.

[RC]: P4L12: The symbol mu usually refers to the mean in statistics. A different variable name here could prevent confusion. Section 2.1.1: This section contains a lot of information about the fundamentals of Bayesian inverse modeling, and it appears that a lot of this information has already been published elsewhere. You could either condense this section or move the material to a supplement.

[AR]: The symbol mu cannot be changed because it is part of published work (Roedenbeck et al., 2005) and therefore will have to be used as such. As per the suggestion, this section will be condensed to avoid repetition with the published literature.

[RC] P4L38: Are there many stations that have weekly flasks? I know of a number of stations with daily flasks. You could change this sentence to "... made once per day or once per week ...."
[AR]: While most of the flask stations used in this study are weekly, there are a couple that make measurements once a day. The text in this sentence will be modified as follows.

[ME]: "While stations based on flask observations have measurements made once per day or once per week, there also exist a growing number of continuously measuring stations with data provided typically half hourly or hourly."
[RC] P5L5-24: You could condense this information or move it to a supplement if it has been published elsewhere.
[ME]: This part of the manuscript will be condensed as follows:
"For surface network sites, to avoid a higher impact of the more frequent continuous observations compared to the less frequent flask observations, the data density weighting considers, for every observation, the number of observations N_surf within the same week. The total uncertainty for that observation increases by a factor of $\sqrt{N_{surf}}$. These N_surf measurements have their errors correlated, and this error inflation by a factor of $\sqrt{N_{surf}}$ helps lessen the impact of measurements that are not independent of each other and hence their contribution to the cost function.
The aircraft is a moving platform, which means that the aircraft profiles span a considerable horizontal and vertical distance while making measurements. Therefore, in contrast to a fixed station, the CO2 concentration along the profile can be expected to de-correlate due to distance, even if taken within a short period of time. We need to incorporate this fact in the de-weighting scheme."

[AR]: This definition of model-data mismatch has been adopted from the paper by Engelen et al. (2002), in which they refer to this quantity as "External Representation Error". The term "model simulated spatial averages" refers to the spatial average over a model grid box. This definition includes both spatial and temporal averaging.

[ME]: The text will be modified as follows: "This similarity in sensitivity of posterior flux between simulation types C and S shows that the effect of the surface network dominates the flux retrieval from the observations using the combined network and indicates that the surface network stations largely contribute to the sensitivity of the retrieved flux to the uncertainty of the boundary layer height."

[RC] P8L28: The word "constraint" is potentially confusing here. You could use a different phrase like "evaluate the utility of the aircraft measurements ...."
[ME]: The text will be modified as follows:
"In this section, we evaluate the utility of aircraft measurements of CO2 from IAGOS for constraining the regional carbon budget."
[RC] P8L39: The phrase "cause a constraint" may not be the optimal wording here. Instead, you could use a phrase like the following: "West winds mean that observations in these regions are sensitive to boreal fluxes."
[ME]: The text will be modified as follows:

"For instance, the value of the uncertainty reduction over North American boreal regions (75 %) is high in spite of insufficient surface stations in that region. This can be attributed to the impact of the westerly winds flowing over Temperate North America. West winds mean that observations in these regions are sensitive to boreal fluxes."
[RC] P9L8: I would re-phrase "least constraint in the fluxes." Instead, you could try "least constrained by the combined network ...".
[ME]: The text will be modified as follows: "Tropical Asia is the least constrained by the combined network since it is not adequately covered by either of the networks, surface or passenger aircraft."

[RC] P9L12: Throughout the article, the word "constraint" is used in a variety of different contexts with different meanings.
I would use this word in a single context and choose different words in different contexts. In this line, I think the phrase "uncertainty reduction" might be more appropriate.
[ME]: The text will be modified as follows: "Tropical and Eurasian temperate regions show the greatest change in the uncertainty reduction of the posterior fluxes on addition of pseudo observations from IAGOS (about 7 to 10 %)."
[RC]: P9L24: I would remove the word "however."
[ME]: Done "The Tropics, on the other hand, show a comparable trend and increase in the change of flux uncertainty with up to 10 times fewer measurements than in the Northern hemisphere."
[RC] P9L25-26: I think this sentence is a run-on. I would put a period after "network" and start a new sentence with the word "hence." Also, I would add a noun after the word "this".
[ME]: Done "This difference in uncertainty reduction change in the two regions is likely to be due to the fact that, unlike the Northern hemisphere, the tropics are not well constrained by the existing network. Hence, the addition of IAGOS profiles leads to considerable constraint on the surface fluxes."
[RC] P10L7 and 10: What do the words "this" refer to in each sentence? I would add a noun after the word "this" in each case.
[ME]: The text will be modified as follows:

"In other words, this mismatch shows that the transport model uncertainties related to boundary layer height are very likely to be translated to the posterior flux when surface measurements are used as a constraint in the inversion, while these errors are not propagated to the retrieved flux when the aircraft profiles are used. This difference in the response of the flux retrieved using the two observation networks is likely to be due to the fact that vertical transport, whose effect we simulate by the redistribution of the tracer mass in the model profile at the location of the airports and surface stations, only redistributes the tracer mass between the boundary layer and the free tropospheric part, keeping the total tracer mass constant."
[RC] P10L22: I would change "reduce" to "decline".
[ME]: Done
"Although we only account for errors in fluxes due to vertical mixing in our simulations, we can say that flux estimation is expected to be more robust when aircraft profiles are used as a constraint since the contribution of the boundary layer height uncertainty to the overall transport model error is likely to decline."
[RC] P10L28: This sentence contains a long clause that makes the sentence difficult to follow. Suggested edit: "We find that IGOS flights will likely provide a strong constraint on regional CO2 flux totals."
[ME]: The text will be modified as follows:
"Furthermore, on estimating the impact that the CO2 measurements made onboard the IAGOS fleet are likely to have on the regional carbon budget once they are available, we find that the IAGOS flights will likely provide a strong constraint on regional flux totals."

Correspondence to: S. Verma (sverma@bgc-jena.mpg.de)

Abstract. Inaccurate representation of atmospheric processes by transport models is a dominant source of uncertainty in inverse analyses and can lead to large discrepancies in the retrieved flux estimates. We investigate the impact of uncertainties in vertical transport as simulated by atmospheric transport models on fluxes retrieved using vertical profiles from aircraft as an observational constraint. Our numerical experiments are based on synthetic data with realistic spatial and temporal sampling of aircraft measurements. We compare the impact of such uncertainties on the fluxes retrieved using the ground-based network with their impact on the fluxes retrieved using the aircraft profiles. We find that the posterior flux retrieved using aircraft profiles is less susceptible to errors in boundary layer height than that retrieved using the ground-based network. This finding highlights a benefit of utilizing atmospheric observations made onboard aircraft over surface measurements for flux estimation using inverse methods.
We further use synthetic vertical profiles of CO2 in an inversion to estimate the potential of these measurements, which will be made available through the IAGOS (In-Service Aircraft for a Global Observing System) project in future, in constraining the regional carbon budget. Our results show that tropical Africa and temperate Eurasia, regions that are under-constrained by the existing surface-based network, will benefit the most from these measurements, with a reduction of posterior flux uncertainty of about 7 to 10 %.

Introduction
Reliable prediction of climate change scenarios requires a thorough understanding of the carbon-climate feedbacks in the earth system, and accurately estimating current sources and sinks of carbon is of prime importance. While it is impossible to measure these sources and sinks directly everywhere around the globe, we may estimate them using the 'top-down' approach, which employs atmospheric observations in combination with knowledge of atmospheric transport and prior knowledge of the fluxes by inverse modelling. The inverse modelling scheme exploits the fact that the spatial and temporal variations of atmospheric trace gases like CO2 contain information about the exchange processes between the atmosphere and the surface of the earth. Unfortunately, the estimates of surface fluxes using this approach are prone to large uncertainties that can largely be attributed to imperfections in the transport models and insufficient data coverage by the observation network (Gerbig et al., 2003).
Atmospheric transport models use meteorological input like wind fields to link the observed atmospheric concentrations of tracers to the estimated fluxes at the surface of the earth. These models are not able to perfectly simulate atmospheric transport processes, which results in uncertainties in the retrieved surface fluxes (Law et al., 1996, 2008; Gerbig et al., 2003; Stephens et al., 2007; Lauvaux et al., 2009; Houweling et al., 2010). One of the dominant sources of transport model uncertainty is the inaccurate representation of the vertical mixing near the surface of the earth and hence the boundary layer height (Stephens et al., 2007; Gerbig et al., 2008). An accurate simulation of the vertical mixing in the boundary layer is critical since it is in this part of the atmosphere that most observations are made and it lies closest to the carbon sources and sinks. Hence, misrepresentation of transport in the boundary layer can lead to significant biases in modelled tracer mixing ratios as well as the retrieved fluxes (Denning et al., 1996, 2008; Yi et al., 2004; Ahmadov et al., 2009).
Furthermore, a weak observational constraint due to insufficient atmospheric data is also an important factor that causes large errors in retrieved fluxes. Lack of measurements in the atmosphere or an unevenly distributed network of observation sites can result in a poorly constrained regional carbon budget (Gurney et al. 2002). Hence, in addition to improved transport models, an enhanced global network of atmospheric measurements is indispensable for more accurate and precise estimation of surface fluxes using inverse modelling.
The current global measurement network of greenhouse gases combines in-situ measurements made by the ground-based stations and satellite instruments measuring total column mixing ratios remotely. While ground-based measurements are highly precise, their main limitation is the sparse and uneven spatial coverage (Bousquet et al., 2006; Marquis and Tans, 2008). While parts of Europe and North America have fairly high data coverage from the surface-based observation network, the tropical regions of Amazonia, Africa, remote regions of tundra, and Siberia are not adequately covered, sometimes even lacking measurements entirely. In addition, these measurements, except those obtained from tall towers, are often not representative of large areas and provide information only at the local scale (Haszpra et al. 1999). Satellites largely overcome this drawback of ground-based measurements since they have the ability to provide information around the world using a single instrument. However, they have their limitations, too, which restrict their use for accurate flux estimation using inverse methods. Space-borne measurements are still somewhat limited by higher measurement uncertainty and systematic errors, as well as temporal heterogeneity in their sampling (Ehret and
average of the aircraft vertical profiles was discussed by Nakatsuka and Maksyutov (2009). However, so far, the suitability of aircraft vertical profiles and their treatment when using them in inversions, given the transport modelling errors related to vertical mixing, has not been addressed.
In this paper we employ synthetic data to investigate theoretical impacts of transport model uncertainties associated with boundary layer height on the fluxes retrieved by using passenger aircraft profiles in an inverse modelling set-up. The synthetic data are generated using a forward run of the TM3 transport model (Heimann and Körner, 2003) and have the temporal and spatial sampling of the measurements made during the MOZAIC project. We examine how closely the posterior flux obtained using the synthetic aircraft measurements as a constraint captures the trends and variability in the flux that is used to generate the synthetic data. This allows us to estimate the impact of the inaccurately simulated vertical mixing.

In the second part of this work, we assess the potential of CO2 observations that will be made onboard the IAGOS fleet for constraining the regional carbon budget and reducing posterior flux uncertainties. We further identify the regions that will benefit the most from these measurements. Only the time, location and uncertainty of the measurements are used for the simulations. Since flight routes of commercial aircraft change little with time, it is reasonable to estimate the constraint that will be brought about by IAGOS aircraft using the sampling from MOZAIC, its predecessor project.

The paper is organized as follows: Section 2 describes the methods used that include estimation of the model representation error (Section 2.1), description of the inversion scheme (Section 2.2) and the experimental set-up (Section 2.3). Section 3 presents the results from the simulations and the conclusions are discussed in Section 4.
(Heimann and Körner, 2003). In this study, our model simulations are carried out at a 4°×5° spatial resolution using the ERA-Interim (European Centre for Medium Range Weather Forecasts (ECMWF) Reanalysis-Interim) meteorology.

In the following paragraphs, we provide a brief description of the inversion system described in more detail in Roedenbeck et al. (2005). Observed atmospheric mixing ratios, C_obs, are compared to modelled atmospheric mixing ratios, C_mod, based on a prior estimate of the surface fluxes. The modelled atmospheric mixing ratio at a specific location, C_mod, is obtained by the multiplication of the linear atmospheric transport operator A, computed by the transport model, with the flux field f and the addition of the initial atmospheric mixing ratio of the transport model at the beginning of the simulation period, C_ini:
$$C_{mod} = A f + C_{ini} \qquad (1)$$
The concentration mismatch between observed and modelled values is defined as
$$m = C_{obs} - C_{mod}$$
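The forward relation described above can be illustrated with a minimal numerical sketch. The toy transport matrix, the flux and mixing ratio values, the function names and the sign convention (observed minus modelled) are all assumptions made for illustration; they are not taken from the TM3 set-up.

```python
import numpy as np

def modelled_mixing_ratio(A, f, c_ini):
    """Return C_mod = A f + C_ini for a linear transport operator A."""
    return A @ f + c_ini

def mismatch(c_obs, c_mod):
    """Concentration mismatch m, here assumed to be observed minus modelled."""
    return c_obs - c_mod

# Toy example (made-up numbers): 3 observations, 2 flux elements.
A = np.array([[0.5, 0.1],
              [0.2, 0.4],
              [0.0, 0.3]])          # hypothetical transport sensitivities
f = np.array([2.0, -1.0])           # hypothetical flux field
c_ini = np.full(3, 400.0)           # hypothetical initial mixing ratio (ppm)
c_obs = np.array([401.0, 400.0, 399.5])

c_mod = modelled_mixing_ratio(A, f, c_ini)
m = mismatch(c_obs, c_mod)
```

Because the operator is linear, doubling the flux field doubles the flux-driven part of the modelled mixing ratio while leaving the contribution of the initial mixing ratio unchanged.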

The aim of the inversion system is to optimize the conditional (a posteriori) probability of the model parameters p with respect to the mismatch m, according to Bayes' Theorem. This corresponds to minimising the cost function J of Eq. (6). The difference between the modelled C_mod and observed C_obs, m, is used to calculate the observation-based term of the cost function, which forms the first term of Eq. (6) and takes into account the measurement and model representation errors.
The second term of Eq. (6) describes the a-priori flux constraints. The additive constant C subsumes all parameter-independent terms, such as those arising from Prob(m) and from the normalization of the distribution. This cost function is minimized iteratively using the adjoint of the atmospheric transport model, as the number of observations and variables to constrain is very large, which prohibits the calculation of an analytical solution. Q_c is defined as the error covariance matrix of the atmospheric mixing ratio mismatch. Its diagonal elements represent the combined measurement and modelling errors for each observation, i.e. $Q_{c,ii} = \sigma_{meas,i}^2 + \sigma_{mod,i}^2$. In order to scale the impact of the a-priori constraint on the Bayesian inversion, the factor µ is used. It determines the ratio between the a-priori information and data constraints. For µ equal to 0, no prior information is used for minimizing the cost function. For high values of µ, the a-priori flux distribution has a high impact on the minimization of the cost function.
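The role of the factor µ can be sketched with a small example. The quadratic form used here, J = ½ mᵀ Q_c⁻¹ m + (µ/2) pᵀ p + C with a diagonal Q_c and normalized parameters p, is an assumption based on the general structure of Bayesian cost functions; the exact expression used in the inversion system is the one given in Eq. (6) and in Roedenbeck et al. (2005). All names and numbers below are hypothetical.

```python
import numpy as np

def cost_function(m, sigma_meas, sigma_mod, p, mu, const=0.0):
    """Assumed quadratic cost: data term + mu-scaled prior term + constant."""
    # Diagonal of Q_c: combined measurement and model error variances.
    q_diag = sigma_meas**2 + sigma_mod**2
    data_term = 0.5 * np.sum(m**2 / q_diag)
    # mu scales how strongly the a-priori constraint enters the cost.
    prior_term = 0.5 * mu * np.dot(p, p)
    return data_term + prior_term + const

# Toy example (made-up numbers).
m = np.array([1.0, -2.0])            # mixing ratio mismatch (ppm)
sigma_meas = np.array([0.6, 0.6])    # measurement error (ppm)
sigma_mod = np.array([0.8, 0.8])     # model error (ppm)
p = np.array([1.0, 2.0])             # normalized flux parameters

j_no_prior = cost_function(m, sigma_meas, sigma_mod, p, mu=0.0)
j_with_prior = cost_function(m, sigma_meas, sigma_mod, p, mu=2.0)
```

With µ = 0 only the data term contributes, while larger µ penalizes departures of the parameters from the prior more strongly, which is the behaviour described in the text.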

In the Jena inversion scheme, these error correlations between measurements are accounted for using a data density 'de-weighting' scheme. It assigns a weight to the error associated with every measurement, computed based on certain predefined criteria. For surface network sites, to avoid a higher impact of the more frequent continuous observations compared to the less frequent flask observations, the data density weighting considers, for every observation, the number of observations N_surf within the same week. The total uncertainty for that observation increases by a factor of $\sqrt{N_{surf}}$. These N_surf measurements have their errors correlated, and this error inflation by a factor of $\sqrt{N_{surf}}$ helps lessen the impact of measurements that are not independent of each other and hence their contribution to the cost function.
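A minimal sketch of the surface de-weighting could look as follows. It assumes that "the same week" means consecutive 7-day bins counted from an arbitrary time origin and that the inflation factor is the square root of N_surf; the function name and inputs are hypothetical and do not reproduce the actual implementation of the inversion system.

```python
import numpy as np

def inflate_surface_uncertainty(times_days, sigma):
    """Scale each uncertainty by sqrt(N_surf), the count of obs in its week.

    times_days: observation times at one site, in days since an origin.
    sigma: uncertainty of each observation before de-weighting.
    """
    times_days = np.asarray(times_days, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Assumption: weeks are fixed 7-day bins starting at day 0.
    week = np.floor(times_days / 7.0).astype(int)
    _, inverse, counts = np.unique(week, return_inverse=True, return_counts=True)
    n_surf = counts[inverse]
    return sigma * np.sqrt(n_surf), n_surf

# Four observations in the first week, one in the second (made-up times).
inflated, n_surf = inflate_surface_uncertainty(
    [0.5, 1.5, 2.5, 3.5, 8.0], [1.0, 1.0, 1.0, 1.0, 1.0]
)
```

In this toy case the four observations sharing a week each have their uncertainty doubled, so together they carry about the same weight in the cost function as the single observation in the following week.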
The aircraft is a moving platform, which means that the aircraft profiles span a considerable horizontal and vertical distance while making measurements. Therefore, in contrast to a fixed station, the CO2 concentration along the profile can be expected to de-correlate due to distance, even if taken within a short period of time. We need to incorporate this fact in the de-weighting scheme. Thus, for the aircraft profiles, N_aircraft is defined to be the number of measurements that lie in a 4-D (3-D space and time) window instead of just those lying within a 1-week interval as used for the surface stations.
Measurements that lie within this 4-D window are taken to have their errors correlated with each other, but are taken to be independent of those that lie outside of it. The 4-D space is defined using the following criteria:
1. Temporal de-correlation length is taken to be 1 week, to be consistent with the treatment of the station data.
2. Horizontal spatial de-correlation distance is set at ±500 km for measurements between the surface and the 700 mbar level and ±1000 km for those above the 700 mbar level.
We use these values of spatial correlation lengths since they are comparable to the grid size that we use for our simulations and sub-grid scale processes cannot be resolved by the transport model. The 700 mbar pressure level represents approximately the maximum of a typical boundary layer height and separates the boundary layer part of the atmospheric column (which is more closely coupled to surface fluxes by fast vertical mixing and hence has a shortened correlation length) from the free troposphere part of the column.
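The 4-D window criteria above can be sketched as a simple neighbour count. Several details are assumptions made only for this illustration: the time window is taken as an absolute difference of at most 7 days, the vertical criterion is taken as "same side of the 700 mbar level", horizontal distance is the great-circle distance, and all function and variable names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km; inputs in degrees, scalars or arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def count_aircraft_neighbours(t_days, lat, lon, p_mbar):
    """N_aircraft for each measurement (the measurement itself is counted)."""
    t, lat, lon, p = (np.asarray(x, dtype=float) for x in (t_days, lat, lon, p_mbar))
    below_700 = p >= 700.0            # between the surface and 700 mbar
    counts = np.zeros(len(t), dtype=int)
    for i in range(len(t)):
        radius = 500.0 if below_700[i] else 1000.0
        same_layer = below_700 == below_700[i]
        close_in_time = np.abs(t - t[i]) <= 7.0
        close_in_space = great_circle_km(lat[i], lon[i], lat, lon) <= radius
        counts[i] = np.sum(same_layer & close_in_time & close_in_space)
    return counts

# Five made-up measurements: time (days), latitude, longitude, pressure (mbar).
n_aircraft = count_aircraft_neighbours(
    [0.0, 1.0, 2.0, 20.0, 3.0],
    [50.0, 50.5, 50.0, 50.0, 50.0],
    [8.0, 8.5, 8.0, 8.0, 20.0],
    [900.0, 850.0, 500.0, 900.0, 400.0],
)
```

By analogy with the surface stations, these counts could then be used to inflate the uncertainty of each aircraft measurement; that final step is an assumption here and is not shown.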

Estimation of model data mismatch error
Model representation error, or model-data mismatch, can be defined as the mismatch between the point observations assimilated in the model and the model-simulated spatial averages (Engelen et al. 2002). This error needs to be pre-specified in the inversion framework. In our model, we use a representation error that varies with altitude. This is because the mismatch is likely to be higher for measurements that lie closer to the surface, while the models perform better at higher altitudes that are not affected as directly by the fluxes. The functional dependency of the mismatch on altitude is computed using data from the CONTRAIL project (Machida et al. 2008).

To quantify this dependency, we compare observations from CONTRAIL (Machida et al. 2008) against TM3 "reanalysed CO2 fields" (i.e., atmospheric CO2 fields simulated by the tracer transport model from surface fluxes previously optimized against CO2 data, such that these fields closely match the data and interpolate in between them). The difference gives the model-data mismatch (mdm) at every level for each airport where CONTRAIL aircraft fly. The vertical resolution of CONTRAIL is 0.25 km; however, the statistics have been aggregated onto a coarser 1-km resolution for this analysis. In order to obtain a typical mdm at every level of a profile, we use the median of the standard deviation of the mdm at each level across all airports that have at least 20 data points. Figure 1 shows the resulting box plot. We then fit an exponential curve of the form $\mathrm{mdm}(z) = a\,e^{bz} + c$ to the median values at each level, where $z$ is the altitude in km, and obtain a = 2.85 ppm, b = -0.4, and c = 3.18 ppm.
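As a quick check of the fitted parameters, the curve can be evaluated at a few altitudes. The sketch assumes that the fit has the form a·exp(b·z) + c with z in km; this form is an assumption that reproduces the mismatch of about 5 ppm at 1 km quoted for Fig. 1 in the Summary, and the function name is hypothetical.

```python
import math

# Fitted parameters quoted in the text; b is taken to be per km (assumption).
A_PPM = 2.85
B_PER_KM = -0.4
C_PPM = 3.18


def model_data_mismatch_ppm(altitude_km):
    """Model-data mismatch under the assumed form a * exp(b * z) + c."""
    return A_PPM * math.exp(B_PER_KM * altitude_km) + C_PPM


# Mismatch at the centre of a few 1-km altitude bins.
mismatch_by_altitude = {z: model_data_mismatch_ppm(z) for z in (0.0, 1.0, 5.0, 10.0)}
```

Under this form the mismatch is largest near the surface (a + c, about 6 ppm) and decays towards c = 3.18 ppm in the upper troposphere.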

Synthetic data at the times and locations of the MOZAIC profiles and the ground network sites are generated both to investigate the impact of boundary layer height errors and to assess the impact of adding aircraft observations on flux retrievals. For the forward run, we use fluxes from the BIOME-BGC biosphere model (Thornton et al., 2005)  going typically up to an altitude of 9-10 km. We choose not to use the cruise-level data for this study because most of these measurements are made around the tropopause region, where the model skill in accurately representing the transport and in linking those measurements via vertical transport to fluxes at the surface is limited (Deng et al. 2015).
Since the profiles generated by the forward run of the transport model use the ERA-Interim meteorology, the boundary layer height represented by these profiles is that of ERA-Interim. We call this the "true" boundary layer height, $\mathrm{BLH}_{\mathrm{true}}$. In order to simulate the vertical-mixing-related imperfections in the transport models, we need to generate new profiles with a "wrong" boundary layer height. We do this by modifying these profiles in such a way that they represent a new boundary layer height, denoted $\mathrm{BLH}_{\mathrm{model}}$, that is different from $\mathrm{BLH}_{\mathrm{true}}$. To achieve this, we use the approach implemented by Kretschmer et al. (2012). This approach assumes that errors in the simulated boundary layer height translate into an incorrect vertical distribution of CO2 in a given atmospheric column, while the total column concentration remains unchanged. We redistribute the CO2 between the free troposphere and boundary layer parts of the atmospheric column in such a way that the BLH for the profile changes to $\mathrm{BLH}_{\mathrm{model}}$. In this study, we use the BLH model  (2000).
The surface network consists of 49 sites (Fig. 2(a)) and the IAGOS observation network consists of measurements from five IAGOS aircraft (Fig. 2(b)). The prior flux used for the inverse simulations is different and independent from the true flux used to generate the pseudo data and is obtained from the Lund-Potsdam-Jena (LPJ) dynamic global vegetation model (Sitch et al., 2003).
In the second part of the study, we estimate the reduction in posterior flux uncertainty brought about by the use of IAGOS vertical profiles as a constraint on the carbon budget. We carry out simulations in which the surface-based observation network is augmented by one or more IAGOS aircraft. These simulations do not require the synthetic data used in the first part of this study, since the inversion system solves for the posterior flux uncertainties based only upon the measurement times and locations and the uncertainties of the prior fluxes and of the measurements (model-data mismatch). The uncertainty reduction is computed for the monthly mean posterior fluxes aggregated over the TransCom3 land regions (Gurney et al., 2000). It is expressed as follows:

The uncertainty reduction is defined as the extent to which the error in the flux field is modified by the inversion. It depends on both the prior uncertainty and the observation coverage, and is a measure of the accuracy of the posterior fluxes estimated by the inversion.
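A widely used form of this diagnostic, assumed here for illustration and not quoted from the study, is UR = (1 - σ_post / σ_prior) × 100 %, where σ_prior and σ_post are the prior and posterior flux uncertainties of a region. The sketch below also computes the change in uncertainty reduction when a second network is added, which is the quantity compared between regions; the function names and the example numbers are hypothetical.

```python
def uncertainty_reduction_percent(sigma_prior, sigma_post):
    """Uncertainty reduction in percent, assuming UR = (1 - post / prior) * 100."""
    if sigma_prior <= 0:
        raise ValueError("prior uncertainty must be positive")
    return (1.0 - sigma_post / sigma_prior) * 100.0


def change_in_reduction(sigma_prior, sigma_post_surface, sigma_post_augmented):
    """Extra uncertainty reduction gained by augmenting the surface network."""
    return (uncertainty_reduction_percent(sigma_prior, sigma_post_augmented)
            - uncertainty_reduction_percent(sigma_prior, sigma_post_surface))


# Illustrative numbers only (arbitrary flux units), not results from the study.
ur_surface = uncertainty_reduction_percent(2.0, 1.8)
ur_augmented = uncertainty_reduction_percent(2.0, 1.5)
gain = change_in_reduction(2.0, 1.8, 1.5)
```

Because the diagnostic is normalised by the prior uncertainty, its absolute value depends on the prior that is chosen, whereas differences between similarly computed reductions are easier to interpret.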
Figure 3 shows the prior uncertainty used by the Jena inversion scheme for the different TransCom3 regions. We focus on the years 1996-2004 because sufficient data are available from MOZAIC during this period. The period also contains some data gaps, representing times when one or more aircraft were not flying, which gives a more realistic quantification of the uncertainty reduction brought about by the use of these data.

Impact of BLH transport model errors on flux retrieval
We analyse monthly posterior fluxes for the TransCom3 land regions and compare them to our "true" flux, which is the flux used to generate our pseudo data. We concatenate the time series of the posterior flux for all regions into a single time series in order to obtain a single diagnostic metric for the whole globe. The statistics for the comparison between the different simulations are represented on a Taylor diagram, as shown in Fig. 4.
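The quantities summarised on a Taylor diagram (standard deviation, correlation and centred root-mean-square difference) can be computed from the concatenated series as in the following minimal sketch. The function names and the dictionary layout are hypothetical, and the example values are illustrative only.

```python
import math


def concatenate_regions(series_by_region, region_order):
    """Join regional monthly flux series into one global series."""
    joined = []
    for region in region_order:
        joined.extend(series_by_region[region])
    return joined


def taylor_statistics(estimate, truth):
    """Standard deviations, correlation and centred RMS difference."""
    if len(estimate) != len(truth) or len(truth) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(truth)
    mean_e = sum(estimate) / n
    mean_t = sum(truth) / n
    anom_e = [e - mean_e for e in estimate]
    anom_t = [t - mean_t for t in truth]
    std_e = math.sqrt(sum(a * a for a in anom_e) / n)
    std_t = math.sqrt(sum(a * a for a in anom_t) / n)
    corr = sum(a * b for a, b in zip(anom_e, anom_t)) / (n * std_e * std_t)
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(anom_e, anom_t)) / n)
    return {"std_estimate": std_e, "std_truth": std_t,
            "correlation": corr, "centred_rmsd": crmsd}


order = ["region_a", "region_b"]
truth = concatenate_regions({"region_a": [1.0, 2.0], "region_b": [3.0, 4.0]}, order)
posterior = concatenate_regions({"region_a": [1.5, 1.0], "region_b": [3.5, 5.0]}, order)
stats = taylor_statistics(posterior, truth)
```

The three statistics are linked by the law of cosines, centred_rmsd² = std_estimate² + std_truth² - 2 · std_estimate · std_truth · correlation, which is what allows them to be shown as a single point on the diagram. Both series must be concatenated in the same region order.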
We see that the transport model errors related to vertical mixing, as simulated using the reshuffling method, affect the flux retrieved from measurements made at surface stations differently from the flux retrieved using aircraft profiles. We observe that there is a large impact of the simulated vertical mixing errors on the flux retrieved using the surface measurements with This is shown by points Ca and Cb being closer to the true flux than points Sa and Sb, respectively. It implies that the addition of the aircraft measurements to the surface-based network improves the constraint on the carbon budget as compared to the surface network alone.

In this section, we evaluate the utility of aircraft measurements of CO2 from IAGOS for constraining the regional carbon budget. To this end, we assess the reduction in the uncertainty of the posterior fluxes relative to that of the prior fluxes. It should be noted that while the uncertainty reduction alone may not be robust, similarly computed uncertainty reductions can be robustly compared.
On the other hand, for regions already well constrained by the surface network, for example North America and Europe, the simulated constraint due to the IAGOS CO2 measurements is very small (less than 1 %).
We further investigated the constraint due to the aircraft measurements on aggregated spatial scales by examining the change in uncertainty reduction on the addition of pseudo measurements from IAGOS for the Northern Hemisphere (30° N to 90° N), the Tropics (30° S to 30° N) and the Southern Hemisphere (90° S to 30° S). The zero-measurements point on the x-axis of Fig. 6(a) and 6(b) indicates the case where only the existing observation network sites, and no IAGOS profiles, have been used in the inversion. For the Northern Hemisphere, the change in uncertainty reduction increases from 0.5 % when measurements from one simulated IAGOS aircraft are used to 2 % when measurements from five aircraft are used. The Tropics, on the other hand, show a comparable trend and increase in the change of flux uncertainty with up to 10 times fewer measurements than in the Northern Hemisphere. This difference between the two regions is likely due to the fact that, unlike the Northern Hemisphere, the Tropics are not well constrained by the existing network. Hence, the addition of IAGOS profiles leads to a considerable constraint on the surface fluxes. The Southern Hemisphere (not shown), which is largely ocean, does not gain much from these measurements, since they are very few in number and are not sufficient to constrain the region. Hence, almost no change is seen in the uncertainty reduction due to aircraft measurements. Thus, we can conclude that the overall impact of IAGOS measurements based upon this sampling is highest for the tropical region. This indicates that the greatest incremental gain in knowledge of fluxes would be achieved by preferentially instrumenting aircraft that fly tropical routes.
It is noteworthy, however, that the saturation of the posterior uncertainty values as the number of measurements approaches the maximum value does not imply that there would be no further benefit from adding measurements from more than five aircraft. The figure is indicative of the information gained solely on aggregated spatial scales, and it is very likely that on smaller scales there is added benefit in having more measurements.

4 Summary
Transport models that drive the inversion schemes often have a poor representation of near-surface vertical mixing, causing large errors in the retrieved fluxes. In this study, we investigate the impact of such transport model uncertainties on the fluxes retrieved using aircraft profiles as a constraint in an inverse modelling set-up. We focus only on errors in near-surface vertical mixing. Those due to imperfect representation of other processes, such as advection and deep convection, have not been accounted for. Our simulations show that the flux retrieved using aircraft profiles when the boundary layer height is well known has the same statistical metrics as the flux retrieved when the boundary layer height is erroneous. This shows that posterior fluxes retrieved using aircraft profiles show no sensitivity to the boundary layer height errors as simulated in our experiments. We compare this behaviour of the retrieved flux to that obtained using the surface measurements as a constraint. These measurements are usually made in the boundary layer part of the atmosphere, and we therefore find a much higher mismatch between the flux retrieved using the correct versus the erroneous boundary layer height in terms of the standard deviation, root-mean-square difference and correlation. In other words, this mismatch shows that the transport model uncertainties related to boundary layer height are very likely to be translated to the posterior flux when surface measurements are used as a constraint in the inversion, while these errors are not propagated to the retrieved flux when the aircraft profiles are used.
This difference in the response of the flux retrieved using the two observation networks is likely due to the fact that vertical transport, whose effect we simulate by redistributing the tracer mass in the model profile at the locations of the airports and surface stations, only redistributes the tracer mass between the boundary layer and the free tropospheric part of the column, keeping the total tracer mass constant. The loss (or gain) of tracer mass in the boundary layer part of the profile is compensated by the gain (or loss) in the free tropospheric part of the profile. Since aircraft profile measurements extend all the way from the surface to the free tropospheric part of the atmosphere, the net impact of the complete reshuffled profile remains comparable to that of the original. This compensation, on the other hand, does not occur for the surface station measurements, since they are made within the boundary layer; hence, an error in the estimation of the boundary layer height will impact the modelled mixing ratio that constrains the inversion.
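The compensation argument can be illustrated with a toy column of equal-mass layers. The sketch below is a deliberately simplified redistribution and is not the algorithm of Kretschmer et al. (2012): the boundary layer is assumed to be well mixed, the layer indices stand in for boundary layer heights, and all names and values are hypothetical.

```python
def reshuffle_profile(profile, n_bl_true, n_bl_model):
    """Move the boundary layer top from n_bl_true to n_bl_model layers.

    profile lists mixing ratios (ppm) of equal-mass layers, surface first.
    The column total is conserved in both directions.
    """
    n = len(profile)
    if not (0 < n_bl_true < n and 0 < n_bl_model < n):
        raise ValueError("boundary layer tops must lie inside the column")
    new = list(profile)
    if n_bl_model >= n_bl_true:
        # Deeper boundary layer: mix everything below the new top.
        mixed = sum(profile[:n_bl_model]) / n_bl_model
        new[:n_bl_model] = [mixed] * n_bl_model
    else:
        # Shallower boundary layer: released layers take the free-troposphere
        # value and the remaining boundary layer absorbs the difference.
        c_ft = profile[n_bl_true]
        released = n_bl_true - n_bl_model
        c_bl = (sum(profile[:n_bl_true]) - released * c_ft) / n_bl_model
        new[:n_bl_model] = [c_bl] * n_bl_model
        new[n_bl_model:n_bl_true] = [c_ft] * released
    return new


def column_mean(profile):
    return sum(profile) / len(profile)


true_profile = [390.0, 390.0, 400.0, 400.0, 400.0, 400.0]
wrong_profile = reshuffle_profile(true_profile, n_bl_true=2, n_bl_model=4)
surface_change = wrong_profile[0] - true_profile[0]
column_change = column_mean(wrong_profile) - column_mean(true_profile)
```

In this toy case the lowest layer, which is what a surface station would sample, changes by several ppm, whereas the average over the whole profile, which an aircraft samples, is unchanged.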

These results demonstrate the benefit of aircraft measurements over those made by ground-based stations for flux estimation using transport models that cannot resolve the boundary layer perfectly. Although our simulations only account for flux errors due to vertical mixing, flux estimation is expected to be more robust when aircraft profiles are used as a constraint, since the contribution of the boundary layer height uncertainty to the overall transport model error is likely to decline. While improved transport models are imperative for achieving more accurate estimates of surface fluxes, the potential benefit of aircraft profiles over ground-based measurements, as shown by our simulations, provides a simple and flexible way of dealing with and eliminating the impact of boundary layer height uncertainties due to vertical mixing, and of diminishing the overall impact of transport model errors on retrieved fluxes. In addition, aircraft profiles would also provide valuable information to drive model development.
Furthermore, on estimating the impact that the CO2 measurements made onboard the IAGOS fleet are likely to have on the regional carbon budget once they are available, we find that the IAGOS flights will likely provide a strong constraint on regional flux totals. The net CO2 flux uncertainty reduction using the IAGOS measurements is likely to be highest in the Tropics and the Eurasian temperate regions. These are regions that are not well covered by the existing surface-based observation network, and hence the addition of aircraft measurements brings about the largest constraint. The change in the uncertainty reduction in these regions is between 7 and 10 %. In contrast, the European and North American continents, which have good data coverage by the surface-based network, show little or no change in flux uncertainty due to added measurements from IAGOS.
We must bear in mind that since the MOZAIC/IAGOS aircraft profiles are measured near airports, which are areas of high anthropogenic emissions, it is likely that these observations are not truly representative of large areas. This fact has been taken into account in this study in a conservative way, by estimating the model-data mismatch uncertainty using the difference between CO2 profiles from the CONTRAIL project and reanalysed TM3 fields (Sect. 2.2). However, better approaches for addressing this question of the representativeness of aircraft profiles exist, for example, those described by Boschetti et al. (2015). The relatively high model-observation mismatch of 5 ppm at 1 km (as shown in Fig. 1) for CONTRAIL data could partly be a result of applying a model and meteorology of low resolution (with respect to the plume size of anthropogenic CO2 transported from large cities near the airports), and should thus be considered as an upper bound on the model-data mismatch.
In summary, our results demonstrate the benefit and application of aircraft profile measurements in an inverse modelling framework. In the near future, an increased number of aircraft profiles of greenhouse gases is expected to become available. Hence, exploiting the potential advantage of this new data stream for inverse modelling studies can go a long way towards developing a better understanding of carbon cycle dynamics in hitherto under-sampled regions of the world.