Technical Note : Atmospheric CO 2 inversions on the mesoscale using data-driven prior uncertainties : methodology and system evaluation

Atmospheric inversions are widely used in the optimization of surface carbon fluxes on a regional scale using information from atmospheric CO2 dry mole fractions. In many studies the prior flux uncertainty applied to the inversion schemes does not directly reflect the true flux uncertainties but is used to regularize the inverse problem. Here, we aim to implement an inversion scheme using the Jena inversion system and applying a prior flux error structure derived from a model–data residual analysis using high spatial and temporal resolution over a full year period in the European domain. We analyzed the performance of the inversion system with a synthetic experiment, in which the flux constraint is derived following the same residual analysis but applied to the model–model mismatch. The synthetic study showed a quite good agreement between posterior and “true” fluxes on European, country, annual and monthly scales. Posterior monthly and country-aggregated fluxes improved their correlation coefficient with the “known truth” by 7 % compared to the prior estimates when compared to the reference, with a mean correlation of 0.92. The ratio of the SD between the posterior and reference and between the prior and reference was also reduced by 33 % with a mean value of 1.15. We identified temporal and spatial scales on which the inversion system maximizes the derived information; monthly temporal scales at around 200 km spatial resolution seem to maximize the information gain.


Introduction
The continuous rise of the abundance of greenhouse gases in the atmosphere, especially due to fossil fuel combustion, alerted the scientific community to systematically monitor these emissions.The challenge is not limited only to revealing the spatial distribution of CO 2 sources and sinks on continental scales; it also accurately quantifies CO 2 emissions and their uncertainties on country scales.In situ atmospheric measurements of the atmospheric CO 2 variability combined with inverse atmospheric models are used as an independent method to provide top-down flux estimates for comparison with estimates from bottom-up methods.The latter use local observations (e.g., eddy covariance, EC) and combine these with ancillary data, e.g., soil maps, satellite data, and terrestrial ecosystem models, in order to spatially scale up local flux estimates to larger regions (Jung et al., 2009).Both approaches act complementarily, for optimal comprehension of carbon sources and sinks in a multiple constraint (Schulze et al., 2010) approach and emission inventory assessments.As these inventories are used to deduce national emission estimates, in compliance with the Kyoto Protocol requirements, accuracy is essential.
An atmospheric inverse modeling system provides the link from atmospheric concentrations to surface fluxes.However, the limited number of observations available for solving the system for quite a number of unknowns (spatially and temporally resolved fluxes) makes the inverse problem strongly under-determined.To solve the inverse problem, the system incorporates Bayes' theorem and uses a priori knowledge, provided by biosphere models and emission inventories ac-P.Kountouris et al.: Atmospheric CO 2 inversions on the mesoscale using data-driven prior uncertainties companied by corresponding uncertainty estimates.Then, the system optimizes the a priori fluxes by minimizing the difference between model predictions and observed concentrations.For the current study only the biospheric fluxes were optimized, and emissions from fossil fuel combustion are assumed to be known much better, as is the case in almost all published regional inversion studies.Inversion systems have been extensively used to derive spatiotemporal flux patterns on global (e.g., Enting et al., 1995;Kaminski et al., 1999a;Gurney et al., 2003;Mueller et al., 2008) and regional scales (e.g., Gerbig et al., 2003a;Peylin et al., 2005;Lauvaux et al., 2012;Broquet et al., 2013).
The challenge in regional inversions is to reconstruct, at high resolution, the spatiotemporal flux patterns, usually of the net ecosystem exchange (NEE).For that purpose currently deployed global or regional inverse modeling schemes use different state spaces (i.e., the set of variables to be optimized through the inversion process).Peters et al. (2007) split the domain of interest into regions according to ecosystem type.Subsequently, fluxes are optimized by using linear multiplication factors to scale NEE for each week and each region.The pitfall of this system is that a zero prior flux has no chance to be optimized and remains zero.Zupanski et al. (2007) divided the NEE into two components, i.e., the gross photosynthetic production (GPP) and ecosystem respiration (R).Then multiplicative factors for the gross fluxes were derived on the grid scale, under the assumption of being constant in time.A step further made by Lokupitiya et al. (2008) used the same approach but with an 8-week time window, allowing for temporal variations in the multiplicative factors.A different approach introducing the carbon cycle data assimilation system (CCDAS) was implemented by Rayner et al. (2005) and Kaminski et al. (2012) by constraining global parameters within a biosphere model able to control surface-atmosphere exchange fluxes, against observed atmospheric CO 2 mole fractions, instead of the fluxes themselves; this CCDAS approach also allows for nonlinear dependencies of the fluxes on the parameters.Lauvaux et al. (2012) used a Bayesian approach based on matrix inversion, separately optimizing day and nighttime fluxes on a weekly timescale for a limited simulation period and domain.An attempt to assess which of these approaches better reproduces NEE was made by Tolk et al. (2011).This study investigated the impact of different inversion approaches via a synthetic experiment utilizing an ensemble Kalman filter technique and the same transport model for all cases.They found that inversions that separately optimize gross fluxes within a pixel inversion concept perform better at reconstructing the NEE, although they fail to obtain the gross fluxes.Taking into consideration these findings, we also choose the pixel-based inversions but optimizing the net biogenic fluxes as we are mainly interested in the total carbon flux budget.
Introducing proper prior flux uncertainties is crucial for meaningful posterior estimates, as these uncertainties weight the prior knowledge between different locations and times, as well as with respect to the data constraint.The uncertainties have the form of a covariance matrix and can be categorized into uncertainties of the prior fluxes and uncertainties of the observational constraint, which includes measurement and transport model uncertainties.While the measurement uncertainty in the observational constraint is usually defined with the main diagonal of the covariance matrix representing the uncertainty of the observations and the model at a specific time and location, our knowledge for the prior uncertainty is limited, especially regarding temporal and spatial correlations that effectively control the state space.Early inversions assumed fully uncorrelated flux uncertainties (Kaminski et al., 1999b), while spatial and temporal correlations were used later by Rödenbeck et al. (2003), who investigated the autocorrelation of monthly CO 2 fluxes calculated using a set of terrestrial and ocean models.In Rödenbeck (2005), spatial correlations for land fluxes were assigned to a state space of 4 • latitude × 5 • longitude resolution.Slightly different correlation length scales were considered for the meridional and zonal direction, assuming that the climate of the latter varies less than of the former.Flux correlations on land were determined by assuming an exponential pulse response function with a length of 1275 km.This leads to correlation lengths approximately 2 times larger compared to the pulse length.Typically the spatial correlations are considered more as a tool to regularize the inverse problem, rather than as an uncertainty feature.Schuh et al. (2010) obtained correlation lengths from Rödenbeck et al. (2003) but with a much higher state space resolution of 200 km.Lauvaux et al. (2008) neglected the spatial correlations to enlarge the impact of the data.Carouge et al. (2010a) inferred spatial and temporal correlation lengths based on the agreement between posterior and "true" fluxes in the framework of a synthetic experiment, where the "truth" is known.A different approach was used in the study of Peters et al. (2007) in which they interpret the length scale from a climatological and ecological perspective and use it to spread information within regions, which the network is incapable of constraining.In particular, correlations are applied such that the same ecosystem types in different TransCom regions (basis function regions; see also http://transcom.project.asu.edu/transcom03_protocol_basisMap.php)decrease exponentially with distance (L = 2000 km), and thus a coupling between the behavior of the same ecosystem is assumed.Ad hoc solutions have also been used, assuming that daily fluxes have smaller correlation lengths than monthly fluxes, which are used by other studies (Peylin et al., 2005).More specifically, Peylin et al. (2005) assumed 500 km for daily temporal resolution compared to the much larger correlation lengths used by Rödenbeck for monthly flux resolution.Michalak et al. (2004) implemented a geostatistical approach to describe the prior error structure.Specifically, the prior error covariance describes which degree deviations of the surface fluxes from their mean behavior at two different locations or times are expected to be correlated as a function of the distance in space or in time.They simultaneously estimate posterior fluxes as well as parameters controlling the modeldata mismatch uncertainty and the prior flux uncertainty, including variance as well as spatial and temporal correlation lengths.Although this approach may be considered as an objective way to infer spatial and temporal correlation lengths, it forces the structural parameters of the error covariance to be statistically consistent with the atmospheric data.In other words, flux parameters are optimized from atmospheric concentration data, and they are forced to have values that can reproduce the atmospheric data.In a similar approach, Ganesan et al. ( 2014) and Lunt et al. (2016) applied a hierarchical Bayesian model using atmospheric concentrations, to estimate both fluxes, and a set of hyper-parameters (e.g., mean and SD of a priori emissions probability density function (PDF) as well as model-measurement SD and autocorrelation scales).However, the covariance parameters depend on the atmospheric data and on the transport model (Michalak et., 2005).Those studies are focused on the sulfur hexafluoride (SF 6 ) and methane (CH 4 ) and a direct comparison is not possible.However, Michalak et al. (2004), who applied the same approach to CO 2 , reported spatial decorrelation lengths of around 1000 km, which is 1 order of magnitude larger than our estimates.In addition, Lunt et al. (2016) report that due to the computational costs, they performed the inversion with no temporal dependence, assuming that the fluxes are constant over a fixed time period.EC stations can provide a more direct method to infer spatial and temporal flux correlations.Chevallier et al. (2006) and Chevallier et al. (2012) introduced autocorrelation analysis of the residual between fluxes simulated by biosphere models and fluxes measured by EC to infer spatial and temporal error correlations.The derived error statistics were implemented in a regional CO 2 inversion by Broquet et al. (2013).
Daily NEE flux residuals from model-data comparisons showed temporal correlations of up to 30 days but very short spatial correlations of up to 40 km (Kountouris et al., 2015).In such a case, the a priori integrated uncertainty over time and space, e.g., annually and Europeanwide domain integrated, according to the error propagation will be exceptionally small.For example, a variance of 1.82 µmole m −2 s −1 (from model-data differences) combined with the abovementioned correlation scales yields an uncertainty of 0.12 GtC yr −1 for the total flux over Europe.This value is significantly smaller than the assumed uncertainty, which is typically used by the inversion systems.For comparison, we refer to studies from Rivier et al. (2010) and Peylin et al. (2005) (for a slightly larger domain than ours) in which an a priori uncertainty of approximately 1.4 and 1 GtC yr −1 , respectively, was used.Further, Peylin et al. (2013) found that the variance of the posterior NEE fluxes integrated over the European domain among 11 global inversions is also 3 to 4 times larger (0.45 GtC yr −1 ).Although it is not yet entirely clear what would be the "cor-rect" value for the prior uncertainty, it seems that in our study it should be increased not only to give enough flexibility to the system to adjust but also to be at least comparable with other posterior uncertainty estimates.A typical method is to inflate the spatiotemporal component by accordingly scaling the prior error covariance.In a study by Lauvaux et al. (2012) two correlation lengths were used at 300 and 50 km, and for the shorter scale the uncertainty was inflated by increasing the RMS of the prior error covariance.The model-data analysis (Kountouris et al., 2015) justifies neither the use of large correlation scales nor largely inflated variances that exceed the model-data flux mismatches; however, it is consistent with an additional overall bias error that cannot be captured from the estimated spatiotemporal error structure.Hence, an appropriate approach would be to introduce two adjustable terms into the inversion system.One term to reflect the data-derived error structure without error inflation (prior error covariance matrix that describes the spatiotemporal component) and one term to represent a bias component.To the best of our knowledge such an approach has not yet been used in inversion systems.
This study primarily aims to use the information extracted from the model-EC data residuals (spatiotemporal error structure) to define a data-driven error covariance rather than simply assuming one, adopting a conservative one or an expert knowledge solution.For that, we implement our previous methodology and findings regarding the prior uncertainty to atmospheric inversions following Kountouris et al. (2015).As explained above, we implement two uncertainty terms: the first one to reflect the true spatiotemporal error structure and the second term to reflect a bias term.We use the Jena inversion system (Rödenbeck, 2005;Rödenbeck et al., 2009) for the regional scale consisting of a fully coupled system as described in Trusilova et al. (2010) that couples the global three-dimensional atmospheric tracer transport model TM3 (Heimann and Körner, 2003) and the regional stochastic Lagrangian transport model STILT (Lin et al., 2003).This scheme allows the retrieval of surface fluxes at a much finer resolution (0.25 • ) compared to global models.The first part of this study details the methodology of the prior error implementation and evaluates the system's performance through a synthetic data experiment.The system evaluation is an extension of Trusilova et al. (2010) in which the evaluation was limited to the observation space only.We extend that to the flux space by comparing flux retrievals on various spatial and temporal scales against synthetic true fluxes.Station locations and observation times (including gaps) were created as in the real observation time series presented in a follow-up study (Kountouris et al., 2018).That way we can use the synthetic experiment to evaluate to what extent we can trust the results, if a real-data inversion is performed.In the follow-up study (Kountouris et al., 2018) the regional inversion system is applied to real observations of atmospheric CO 2 mole fractions from a network of 16 stations.This paper is structured as follows.In Sect. 2 we present the inversion scheme and introduce the settings of the atmospheric inversions.In Sect. 3 we present the results from a synthetic inversion experiment aimed to assess the prior error setup, considering it as a step towards atmospheric inversions using real atmospheric data with an objective, state-ofthe-art prior error formulation.Discussion and conclusions follow in Sects.4 and 5, respectively.

Inversion scheme
The Jena inversion system (Rödenbeck 2005;Rödenbeck et al., 2009) was used for the current study.The scheme is based on the Bayesian inference and uses two transport models, the TM3 model (Heimann and Körner, 2003) for global simulations and the STILT model (Lin et al., 2003) for regional simulations.The advantage of the system is that it combines a global transport model with a regional one without the need of a direct coupling along the boundaries.The global transport model is used to calculate fluxes from the far field (outside of the regional domain of interest), and subsequently this information can be used to provide lateral boundary information for the regional model.The primary input of the system is the observed mixing ratios c meas .This vector contains all measured mixing ratios at different times and locations.The modeled mixing ratios c mod given from a temporally and spatially varying discretized flux field f are computed from an atmospheric transport model and can be formally expressed as where c ini is the initial concentration and A the transport matrix that maps the flux space to the observation space.For the regional domain, the transport matrix A has been precomputed using the STILT transport model.The system calculates the modeled concentrations when and where a measurement exists in the c meas vector.The initial concentration is assumed to be well mixed and remains constant throughout the simulation.The assumption of the well mixed initial concentration is considered to be valid since any spatial structure would be lost during the spin-up period.
In the following, we briefly describe the inverse modeling approach.For more details the reader is referred to Rödenbeck (2005).
In grid-based atmospheric inversions the number of unknowns (spatially and temporally resolved fluxes) is larger than the number of measurements (hourly dry mole fractions at different sites), making the inverse problem ill-posed.In the Bayesian concept this can be remedied by adding a priori information.This information can be written as where f fix is the a priori expectation value of the flux, matrix F contains all the a priori information about flux uncertainties and correlations (implicitly defining the covariance matrix), and p is a vector representing the adjustable parameters.The parameters p are uncorrelated with zero mean and unit variance.This flux model represents just a different way to define the a priori probability distribution of the fluxes, than the traditional way in which the a priori error covariance matrix is explicitly specified.The cost function describing the observational constraint is expressed as where Q c is the observation error covariance matrix.This diagonal matrix weights the mixing ratio values considering measurement uncertainty, location-dependent model uncertainty and a data density weighting.The data density inflates the uncertainty over weekly intervals by a factor of the square root of the measurement number within a given time interval.This ensures that the higher number of data from continuous measurements compared to the data from flask measurements would not lead to a considerably stronger impact of these corresponding sites (Rödenbeck, 2005).This can also be formally interpreted as a temporal correlation scale, which ensures that the model-data mismatch error is not independent within a week, corresponding roughly to timescales of synoptic weather patterns.
The inversion system seeks to minimize the following cost function that combines the observational constraint (Eq. 3) and the prior flux constraint The minimization of the cost function is performed iteratively with respect to the parameters p by using a conjugate gradient algorithm with re-orthogonalization (Rödenbeck, 2005).

A priori information and uncertainties
The a priori CO 2 flux fields were derived from the Vegetation Photosynthesis and Respiration Model, VPRM (Mahadevan et al., 2008).VPRM uses ECMWF (European Centre for Medium-Range Weather Forecasts) operational meteorological data for radiation (downward shortwave radiative flux) and temperatures (T2m), the SYNMAP land cover classification (Jung et al., 2006), and EVI (enhanced vegetation index) and LSWI (land surface water index) derived from MODIS.Model parameters were re-optimized for Europe using eddy covariance measurements made during 2007 from 47 sites (a full site list is given in Kountouris et al. (2015); we excluded some sites due to insufficient temporal data coverage or lack of representativeness).To mediate the impact of data gaps, a data density weighting was introduced that takes into account the coverage of different times of the day (using 3 h bins) in the different seasons.Optimized parameters are shown in Table 1.The NEE on an hourly scale and at 0.25 • × 0.25 • spatial resolution for 2007 was simulated with the optimized parameters for the European domain shown in Fig. 1.The domain-wide aggregated biospheric carbon budget for 2007 derived that way from VPRM was found to be −0.96GtC yr −1 (i.e., uptake by the biosphere).Note that without the density weighting an even stronger flux of −1.35 GtC yr −1 was derived, indicating the importance of proper treatment of data gaps by either gap filling or by the inclusion of weights.
Additionally, biogenic CO 2 fluxes were simulated with the BIOME-BGC model, specifically its global implementation as GBIOME-BGCv1 (Trusilova and Churkina, 2008) at the same 0.25 • × 0.25 • spatial and hourly temporal resolution.The purpose of the second flux field is to provide a perfectly known flux distribution as true fluxes that can be used to generate synthetic observations.The BIOME-BGC model is a terrestrial ecosystem process model, which requires only standard meteorological data such as daily maximum-minimum temperature, precipitation, incoming shortwave solar radiation, vapor pressure deficit, and the day length.How accurate the modeled fluxes are is difficult to say since this would require a validation against observed fluxes from EC stations.Nevertheless, biospheric models still suffer from large uncertainties.The remarkably diverse results between models confirm how uncertain models are (see Friedlingstein et al., 2014).However, in the current experiment the accuracy of the true fluxes is not of concern since we aim only to create a synthetic flux field that we know perfectly.
The a priori flux in a real-data inversion would have three components including fossil fuel and ocean fluxes: (5) We note that for the synthetic case the last two a priori terms are set to zero.Similarly the deviation term (the data-derived correction to the a priori fluxes) of the flux model (Eq.6) consists of the terms referring to NEE, fossil fuel and ocean fluxes.Here in the synthetic case the last two terms are set to zero (i.e., they are not optimized).
The total prior uncertainty was chosen according to the mismatch between VPRM and BIOME-BGCv1, calculated as the annual and domain-wide integrated flux mismatch.Prior fluxes and the fluxes representing the synthetic truth are strongly different (−0.96 and −0.31 GtC yr −1 for VPRM and GBIOME-BGCv1, respectively).The error structure used for the synthetic study is estimated according to the method applied in Kountouris et al. (2015).Time series of daily fluxes were extracted for both biosphere models at grid cell locations where an EC station exists.Fluxes from GBIOME-BGCv1 can also be regarded as synthetic EC fluxes.Then spatial and temporal autocorrelation analysis was performed on the daily model-model flux residuals, yielding a spatial correlation length scale of 566 km and a temporal correlation scale of 30 days.We note that the current study does not directly make use of the error structure derived in Kountouris et al. (2015) since this is applicable for real-data inversions.Instead we use the same methodology to derive the actual model-model error structure since here we perform a synthetic data inversion, exploring, amongst other things, the accuracy of this method.
The EC station locations used for this analysis were exactly the same as in Kountouris et al. (2015), ensuring similarity in the derivation of the error structure for the synthetic data inversions.Following this approach, apart from the similarity, we also ensure that results from the synthetic experiment would be informative for a real-data inversion by using exactly the same information to characterize the prior uncertainties.Of note is that for the synthetic data inversions, prior fluxes from VPRM model were not optimized against GBIOME-BGCv1 true fluxes.P. Kountouris et al.: Atmospheric CO 2 inversions on the mesoscale using data-driven prior uncertainties The implicitly defined prior error covariance matrix contains diagonal elements of 1.45 µmol m −2 s −12 that reflect the variance from model-model flux mismatches at the 50 km spatial resolution of the state space.Exponentially decaying spatial correlations were implemented with a correlation scale of 766 km at the zonal direction and 411 km at the meridional direction, roughly corresponding to the 566 km correlation scale yielded from the model-model residual autocorrelation analysis and preserving the same zonal / meridional ratio as in the global inversion.Temporal autocorrelation was set to 31 days, which is consistent with the Kountouris et al. (2015) analysis.These scales result in an annually integrated and domain-wide uncertainty for the spatiotemporal component (E st ) of 0.44 GtC yr −1 .We chose two different approaches to increase the prior uncertainty on a domain-wide and annually integrated scale such that it matches the mismatch of 0.65 GtC yr −1 between the two biosphere models.First we inflate the error by scaling the error covariance matrix; this case is referred to as base case B1 hereafter.The second approach, referred to as scenario S1, could be considered as a more formal approach: we introduce an additional degree of freedom to the inversion system by allowing for a bias term.This term is spatially distributed according to the annually averaged VPRM respiration component and is kept constant in time.The idea behind the implementation of this term is that on large scales a bias might exist that cannot be captured in the model-data residual autocorrelation analysis (EC measurements are representative on scales of ∼ 1 km).This assumption avoids the artificial inflation of the uncertainty on a pixel scale and restricts the pixel-to-pixel corrections to be statistically consistent with the actual error structure.The bias shape selection (respiration shape) was preferred over the NEE fluxes, as otherwise a priori neutral pixels (with zero NEE) could not be bias corrected.Further, allowing bias to have a spatial shape might be sound since regions with stronger fluxes might also be more biased.The error E BT of the bias component was adjusted such that the total prior error E tot for annually and domain-wide integrated fluxes matches the targeted total uncertainty: This resulted in an overall uncertainty E tot of 0.65 GtC yr −1 , which is identical to the mismatch between the two biosphere models.

State space
The inversion system optimizes additive corrections to 3hourly fluxes in a sense that the posterior flux estimate can be given by the sum of a fixed a priori term (first term of the right-hand side in Eq. 8) and an adjustable term (second term in Eq. 8).The latter has a priori a zero mean.The biogenic fluxes can be defined as follows: where f sh is a shape function that defines the adjustable term.
The spatial and temporal correlation structures of the uncertainty are described by the pulse response functions G xycor and G tcor , respectively.The term p inv contains the adjustable parameters that a priori have a Gaussian distribution with zero mean and unit variance.
Note that the a priori error covariance matrix (Q f,pr ) does not explicitly appear in the inversion, but it is included through the second term in Eq. ( 8).According to this formulation the columns of G tcor and G xycor contain the spatiotemporal extents of the individual NEE pulses (range of values between 0 and 1), and the diagonal matrix f sh (x, y, t) contains the pixel-wise a priori uncertainties.These uncertainties were chosen to be flat (constant) in space and time.For more detailed information the reader is referred to Rödenbeck (2005).
For the S1 case the posterior flux estimates can be expressed by adding the optimized bias flux field to Eq. ( 8) The bias term f BT follows a flux shape (here we used annually averaged respiration, with no temporal variation).Following Rodgers (2000), the posterior flux uncertainties are contained in the covariance matrix of the posterior probability distribution, which can be estimated from Eq. ( 10): where Q c is the measurement error covariance matrix.

Observation vector and uncertainties
The observation vector c meas contains mixing ratio observations at all site locations and sampling times.A common procedure to derive synthetic observations is to create a true flux field by adding some error realizations to the a priori fluxes (Schuh et al., 2009;Broquet et al., 2011) or to perturb the resulting synthetic observations (Wu et al., 2011).
Instead, for the current study we use a different biosphere model, the GBIOME-BGCv1 model, to derive biogenic CO 2 fluxes on an hourly scale.Such an approach is also used by Tolk et al. (2011).Then a forward transport model run was Monthly data coverage plot for the atmospheric stations used in the regional inversions.The left column shows the code name and the right columns show the station class and the assigned uncertainty in units of ppm."C" stands for continental sites near the surface, "T" for continental tall towers, "S" for stations near shore, "M" for mountain sites, "MU" for mountain sites with diurnal upslope winds and "UP" for urban pollutant.
performed to create synthetic mixing ratios at hourly resolution for each station location.We note that the synthetic data were derived without adding error realizations.This choice of using two different biosphere models for deriving the a priori and the true fluxes is expected to increase the realism of the synthetic data study, given the fact that the real spatiotemporal flux distribution is highly unknown (though the model-to-model difference may not accurately reflect the model errors either).The use of two different biosphere models assures that the prior simulated CO 2 time series will never match with the pseudo-observations (the known truth).For the synthetic study, observations were created for the same station locations and observation times as in the real observation time series that are used in the follow-up study (Kountouris et al., 2018).An overview of the atmospheric stations is given in Table 2 and Fig. 1.The data coverage per station is shown in Fig. 2.Only daytime observations were considered (11:00-16:00 LT) since the transport model is expected to perform worse at night when a stable boundary layer forms.
An exception is made for mountain stations that measure the free troposphere, where only nighttime observations (23:00-04:00 LT) were considered, as this time can be better represented by the transport model (Geels et al., 2007).In total, 20 273 hourly observations from the year 2007 were used.The model-data mismatch uncertainty associated with each measurement is expressed as a diagonal covariance ma-trix and contains measurement errors and errors from different components describing the modeling framework (i.e., model errors due to imperfect transport, aggregation errors) (Gerbig et al., 2003b).For the current study, all sites are classified according to their characteristics (e.g., tall tower, mountain sites), and uncertainties were defined depending on the site class (Fig. 2, legend on the right).The uncertainties are considered as representative for current inverse modeling systems.Although the measurement error covariance is a diagonal matrix, transport error correlations might be present.Although we do not explicitly introduce off-diagonal terms in the measurement error covariance matrix, we do consider for temporal correlations via a data density weighting function that inflates the uncertainty.(see Sect. 2.1 and more information in Rödenbeck, 2005).

Atmospheric transport
For the synthetic data study only the regional atmospheric model STILT was used to create the observations with a forward run and to perform the inversion.This was feasible since the synthetic CO 2 observations are only influenced by fluxes occurring within the domain of interest; hence, global runs to retrieve boundary conditions at the edge of the domain of interest are not necessary.The transport matrix for the regional inversions was generated in the form of precalculated footprints (sensitivities of atmospheric observations to upstream fluxes) at 0.25 • spatial and hourly temporal resolution for the full year 2007.STILT trajectory ensembles were driven by ECMWF meteorological fields (Trusilova et al., 2010) and computed for 10 days backwards in time, ensuring that nearly all trajectories have left the domain of interest.
With respect to the assumed model height, STILT uses surface elevation maps from ECMWF (European Centre for Medium-Range Weather Forecasts) with a resolution of 0.25 • ×0.25 • .As the model orography represents an average over the whole grid cell, it is, in particular at steep mountain sites, significantly smaller compared to the real orographic height at the station location.In order to better represent the location of the station in the large-scale flow, a model height that more closely represents the real height (a.s.l.) of the measurements is usually assumed.However, using exactly the measurement height (a.s.l.) in the model would decouple the CO 2 signal too strongly from the surface fluxes and hence lead to a systematic underestimation of the surface influence on the concentrations (Geels et al., 2007).A compromise was reached by adjusting the model height (above ground) by half the distance between the model orographic height and the real station height.
Table 2. Information on the stations used for the regional inversions.Same network applied for the synthetic data inversions and the real-data inversions in Kountouris et al. (2018).In the first column the term "type" stands for continuous (C) or flask (F) data.3 Results

Site code/type
The purpose of the synthetic study is to evaluate the system setup with a realistic approach.To evaluate the ability of the system to retrieve the synthetic true fluxes, we visualize spatially distributed fluxes and we study spatially integrated (domain and national scale) as well as temporally (annual and monthly scale) integrated fluxes.

CO 2 mole fractions
A comparison of true and modeled CO 2 dry mole fractions from forward runs of the optimized fluxes can reveal the goodness of fit, realized through the optimization process.Such a comparison is presented in Fig. 3 for the Schauinsland (SCH) continuous station.Both B1 and S1 inversions significantly reduce the misfit between the synthetic (truth) and the a priori mole fractions.As expected from the optimization (i.e., minimization of the cost function), the RMSD between the prior and posterior from the true time series for all stations (Table 3) shows an average reduction of around 74 and 76 % for the S1 and B1 inversions, respectively.Prior correlations (prior vs. true dry mole fractions) have an averaged value of 0.46, which is increased to 0.93 for both inversions.Significant differences between the two inversions were not found apart from a slightly larger decrease in the RMSD for the B1 case. Figure 4 summarizes the capability of the inversions to capture the true signal at each station location in the form of a Taylor diagram, indicating that the inversions showed a significant increase in the correlation for all sites.Further, the variance of the modeled time series is significantly closer to the variance of the true signal.To estimate the goodness of fit we consider the station-specific χ 2 c values (Eq.11) following Rödenbeck et al. (2003).Here we use 7-day aggregated residuals instead of hourly residuals to match the temporal scale of 1 week of the observation error.With 68 % probability, the modeled dry mole fractions should be within the ±1σ range from the observed mole fractions.This is equivalent to the requirement that the dry mole fraction part of the cost function defined as the sum of hourly squared differences, divided by the uncertainty interval and the number of observations n (Eq.11), should be close to unity.Values smaller than 1 are found for most of the stations, with a mean value of 0.28 and 0.32 for the B1 and S1 cases, respectively, suggesting a good fitting performance for all stations and for both inversions.The results are comparable with those found in the Rödenbeck et al. (2003) study.Another important aspect is the reduced χ 2 r metric, which we use to assess the model performance.By definition the reduced χ 2 r can be obtained by dividing the squared residuals of optimized dry mole fractions minus observed dry mole fractions by the squared specified uncertainties.This is also equivalent to 2 times the cost function at its minimum divided by the number of degrees of freedom (effective number of observations) (Tarantola, 2005): Again, a correct balance should be close to unity.The reduced chi square (Eq.12) was found to be 0.21 for both cases, indicating that the error variance is overestimated, making the error assumption rather conservative.

Flux estimates and uncertainties
In flux space, we evaluate the inversion performance by comparing the retrieved flux estimates against the synthetic fluxes (true) on different temporal and spatial scales: annually and monthly integrated fluxes, domain-wide and on a country scale.In particular we are interested in capturing the true fluxes down to country scale.For that we assess monthly posterior retrievals, which we compare to reference data (true fluxes), country aggregated, using a Taylor diagram.This diagram provides a concise statistical summary of how well patterns match each other in terms of their correlation and the ratio of their variances.
The spatial distributions of the annual biosphereatmosphere exchange fluxes for the prior, the known truth and the posterior cases are presented in Fig. 5.Note that annual fluxes between the two biosphere models used for prior fluxes and true fluxes are substantially different.The inversion significantly adjusts the spatial flux distribution mainly in central Europe and in southern Scandinavia, where a denser atmospheric network exists.The absolute annual mean difference in fluxes (|mean(true − prior)| and |mean(true − posterior)|) is greatly reduced from 70.8 g C m −2 yr −1 to 14.7 and 24.6 g C m −2 yr −1 for the B1 and S1 inversions, respectively.Detailed patterns, however, are not well reproduced: the fraction of explained spatial variance in the true fluxes (measured as squared Pearson correlation coefficient) decreases from the prior (0.17) to the posterior (0.07 and 0.06 for the cases B1 and S1, respectively).When evaluating this on monthly scales, the fraction of explained spatial variance increases in the posterior estimates compared to the prior for winter months from around 0-15 % to about 15-50 %, while during the growing season a decrease from around 10-35 % to about 0-34 % is typically found.The accumulated footprint of the atmospheric network is shown in Fig. 6, clearly indicating the strongest   bution.When separating the spatiotemporal component from the bias component (in S1 case), we can identify differences between the two inversions.Significant deviations of the spatial flux distribution between the spatiotemporal components were found: the spatiotemporal component in the S1 case has a domain-wide annual flux correction of 0.39 GtC yr −1 (prior -posterior) while the corresponding term in the B1 case has a correction of 0.78 GtC yr −1 .Nevertheless, SDs of the corrections with respect to the true spatial flux distribution (true -posterior) were found to have no significant difference (6.88 × 10 −5 and 7.38 × 10 −5 GtC yr −1 cell −1 for S1 and B1, respectively).We do not observe any strong correction in the southeastern part of Europe as it cannot be "seen" from the atmospheric network due to the distance to the observing sites and the prevailing westerly winds.This could also be inferred from the flux innovation plots (see Fig. 5) defined as the difference between prior and posterior fluxes.Only very small or even no corrections occurred in this area.
We are specifically interested in the ability of the inversion system to capture integrated fluxes over time and space.Figure 7 shows an overview of the domain-integrated fluxes on monthly and annual scales.Despite the remarkably larger a priori (VPRM) sink compared to the synthetic truth (GBIOME-BGCv1) during the growing season, both inversions, with and without the bias term, produce posterior flux estimates that fully capture the true monthly and annually integrated fluxes.While the monthly posterior estimates give no clear evidence on which inversion performs better, retrievals on an annual scale slightly favor the inversion without the bias term (B1 case).A difference was observed in the prior uncertainties between the two inversions.While both were scaled to have the same prior annual uncertainty, the B1 inversion has systematically larger prior monthly uncertainties than the S1 as a result of the inflated spatiotemporal component of the prior error covariance.Posterior uncertainties were found to be similar and include or are close to including (S1 case) the true flux estimates.The uncertainty reduction for annually and domain-wide integrated fluxes, defined as the difference between prior and posterior uncertainties normalized by the prior uncertainty, was found to be 73 and 69 % for S1 and B1, respectively.Note that whilst the prior uncertainty refers only to the flux space, the posterior uncertainty depends on the uncertainty of prior fluxes, measurements and transport.
In order to assess how well the posterior estimates agree with the true fluxes, RMSD between true and posterior monthly integrated gridded fluxes was computed (Table 4).Both B1 and S1 inversions show a similar reduction in the RMSD values compared to the prior.The same picture emerges for the annually integrated fluxes.
Of particular interest is the performance of the system on a regional scale, specifically at national level.Figure 8 shows monthly fluxes for selected European countries, including the prior, true and posterior estimates with the corresponding uncertainties.Both error structures show a similar performance.
Despite the large prior misfit, the system succeeded in retrieving monthly fluxes at country level.Better constrained regions mainly located in central Europe show the ability to broadly capture the temporal flux variation on a monthly scale.Figure 9 summarizes in a Taylor diagram the inversion performance for the S1 case and for each EU-27 country, showing the improvement of monthly and country aggregated fluxes (a perfect match would be if the head of the arrow were to coincide with the reference point marked as a green bullet).It is worth mentioning that for regions that are less constrained by the network, such as Great Britain, Spain, Poland and Romania, the inversions also still improved the posterior estimates compared to the prior estimates (see also Fig. 9).

Evaluation with synthetic eddy covariance data
In order to investigate the potential of using EC measurements for evaluating the retrieved CO 2 fluxes, monthly fluxes from the prior (VPRM), the truth (GBIOME-BGCv1) and the posterior for cases B1 and S1 were extracted at the grid cell locations where EC stations exist, using the same 53 sites as in Kountouris et al. (2015).The corresponding fluxes were then aggregated over all sites, using a weight that compensates for the asymmetry between number of flux towers for specific vegetation types and the fraction of land area covered by the specific vegetation type.Prior fluxes show a systematically larger uptake compared to the truth, predominantly during the growing season with maximum differences of 0.8 g C m −2 day −1 (Fig. 10).Posterior estimates for both cases captured the magnitude of the true fluxes, with maximum differences of around 0.3 g C m −2 day −1 during June-July.A significantly larger correction is apparent during spring and summer compared to winter and fall.

Performance in flux space
Results from the synthetic experiment showed the strengths but also the weaknesses of the system to retrieve the true spatial flux distribution.Although the error structure applied to this experiment was statistically coherent with the mismatch between prior and true fluxes, we note a limited ability of the current atmospheric network to retrieve fluxes on local scales.For coarser spatial scales (country level) the carbon budget estimates in the synthetic inversion showed a quite good performance on monthly and annual temporal scales.Further, we observed an average reduction of the monthly uncertainties of 65 % for the B1 case and 64 % for the S1 case.In combination with the fact that the flux estimates reproduce the truth within the posterior uncertainties, this gives us confidence in the accuracy of our estimates.
The current study does not focus on the transport error quantification but rather includes it as diagonal elements in P. Kountouris et al.: Atmospheric CO 2 inversions on the mesoscale using data-driven prior uncertainties the measurement error covariance, which is typical in atmospheric inversions.The chi square values confirm that there is no underestimation of the uncertainties.We note though that erroneous flux estimates are likely to be estimated, es-pecially on finer spatial scales on which the transport model is not able to resolve the real transport (e.g., individual eddys, complicated terrain).However, for coarser spatial scales transport models are expected to perform better, which seems to be in line with comparisons of the prior and posterior with the true flux estimates, which better agree on largely aggregated scales.
Prior error correlation in time and space limits the scale on which information can be retrieved from the inversion.The spatial correlation of several hundreds of kilometers implies that fluxes on scales smaller than this cannot be significantly improved by the inversion, as the results clearly showed.To assess this more quantitatively, the spatial correlation between a priori or retrieved and true monthly fluxes is calculated for different spatial aggregation scales (starting at 0.25 • , fluxes were aggregated to 0.5 and then in 1 • steps up to 8 • ).Results shown in Fig. 11a indicate a nearly monotonous increase in the spatial correlation of prior and posterior fluxes with increasing aggregation scale.The additional explained variance brought about by the inversion, i.e., the difference between posterior (red or blue line) and prior (grey line) flux correlation (r square) with the truth, starts at low values around 0.1 and reaches values around 0.2 for scales larger or equal to 2 • .Similarly, the spatial correlation between a priori and true fluxes for a given spatial aggregation of 2 • but for different temporal aggregation scales ranging from 1 to 128 days (Fig. 11b) shows a continuous increase from about 0.23 to 0.42 (r square), while the spatial correlation between retrieved and true fluxes only varies slightly between 0.4 and 0.53 (Fig. 11b, red and blue lines).Here, the additional spatial variance explained by the retrieved fluxes is largest on around monthly timescales (differences between prior and posterior r square around 0.2), while on seasonal scales this additional explained variance is only around 0.1.Overall, this analysis confirms that there are preferred spatial and temporal scales on which the inversion retrieves the flux distribution best and where thus most information is gained.This is not dependent on whether or not a bias term is included in the state vector, as results for cases B1 and S1 do not differ in this regard.It is important to realize that all other scales on which the inversion does not provide much information need to be properly represented by the a priori flux distribution.Thus, the a priori fluxes need to be realistic on short spatial scales below about 200 km, on seasonal temporal scales and of course on hourly timescales that are not retrieved by the inversion.
The annual spatial flux distribution of the B1 and S1 cases was found to be quite similar, indicating that inflating the uncertainty by a factor of 1.5 (B1 case, see also Sect.2.2.1) or adding a bias component to compensate for the inflation (S1 case) lead to a similar flux constraint.This could be explained due to the long correlation length (566 km), which drastically reduces the effective number of degrees of freedom, forcing the fluxes to be smoothly corrected, regardless of the use of the bias component.
The true fluxes were used to validate the posterior flux estimates.In this synthetic experiment, both fluxes share the same spatial resolution (25 km), which makes the validation straightforward.In a real-data inversion, EC measurements will substitute the true fluxes, making the spatial scales not directly comparable.Despite the scale mismatch, Broquet et al. (2013) compared the posterior flux estimates against EC data with promising results, showing that posterior mismatches are in good agreement with the theoretical uncertainties.

Performance in observation space
The high RMSD reduction in combination with the high correlation values and the captured variability between posterior and true dry mole fractions in the synthetic experiment suggest a good performance of the inversion system to retrieve the true mixing ratios.Nevertheless, this is not surprising, as the atmospheric data are fitted by the inversion.Further-more, the forward and the inverse runs used identical transport, without any impact from imperfections in transport simulations.
The uncertainties in the flux space are statistically consistent with the model-model flux mismatch.However the reduced χ 2 r values obtained from the inversions were rather small (around 0.21).This indicates that overall conservative uncertainties were assumed, and the small χ 2 r values are a result from the assumed uncertainties in the observation space.Indeed, uncertainties in the observation space also include transport uncertainties; however, given that the same transport is used to create synthetic observations and to perform the inversion, there is no actual model-data mismatch related to transport uncertainties, and thus the assumed uncertainties are overestimated.In the current study we assumed a diagonal measurement error covariance matrix.Concerns might rise that the observational uncertainties are underestimated due to the absence of the error correlations.However, we do implicitly consider this transport errors might by correlated over time, and we do consider that via the data density function.Further, for the synthetic study the χ 2 r values prove to be a fair treatment of the observational uncertainties.

Conclusions
This technical note describes the setup and the implementation of prior uncertainties as derived from model-eddy covariance data comparisons into an atmospheric CO 2 inversion.The inversion system assimilates hourly dry air mole fractions from 16 ground stations to optimize 3-hourly NEE fluxes for the study year 2007.Two different error structures were introduced to describe the prior uncertainty by either inflating the error or by adding an additional degree of freedom allowing for a long-term bias.The need of this error inflation comes from the fact that the spatiotemporal modeldata error structure alone underestimates prior uncertainties typically assumed for inversion systems on continental and annual scales.In this study we evaluate the Jena inversion system by performing a synthetic experiment and expanding the evaluation to the retrieved fluxes, whilst only the observation space was evaluated in Trusilova et al. (2010).Further, we assess the impact when adding a bias term in the flux error structure.This study is a preparatory step to retrieving European biogenic fluxes using a data-driven error structure consistent with model-flux data mismatches, which is described in a follow-up study (Kountouris et al., 2018).
Significant flux corrections and error reductions were found for larger aggregated regions (i.e., domain-wide and countries), giving us confidence on the reliability of the results for a real-data inversion at least for aggregated scales up to the country level.We found a similar performance for both error structures.A more detailed analysis of the spatial and temporal scales, on which the inversion provides a significant gain in information on the distribution of fluxes, clearly con- firms that (a) fluxes on spatial scales much smaller than the spatial correlation length used for the a prior uncertainty cannot be retrieved; (b) the inversion performs best on around monthly temporal scales; and (c) especially the small spatial scales need to be realistically represented in the a priori fluxes.

Figure 1 .
Figure 1.Domain of the inversions (dashed rectangle).Locations of the atmospheric measurement stations are shown with blue marks.

Figure 2 .
Figure2.Monthly data coverage plot for the atmospheric stations used in the regional inversions.The left column shows the code name and the right columns show the station class and the assigned uncertainty in units of ppm."C" stands for continental sites near the surface, "T" for continental tall towers, "S" for stations near shore, "M" for mountain sites, "MU" for mountain sites with diurnal upslope winds and "UP" for urban pollutant.

Figure 3 .
Figure 3. Daily nighttime (23:00-04:00 UTC) averages for prior, true and posterior CO 2 dry mole fraction time series for the mountain site Schauinsland.Time starts at 1 January 2007.Note that due to the almost perfect fit, posterior and true time series overlap.

Figure 4 .
Figure 4. Taylor diagram for daily averaged modeled and measured time series (annual basis) of CO 2 dry mole fractions.Prior (black), true (green, the perfect match of modeled and true time series) and the different inversion cases (B1 blue; S1 red) are displayed.Different symbols denote different atmospheric stations.The normalized SD was calculated as the ratio of the SD of the modeled time series to the SD of observations.

Figure 5 .
Figure 5. Annual spatial distribution for the prior, true and posterior biogenic flux estimates for the two synthetic inversions S1 and B1 (a, b), and flux innovation defined as the difference of posterior minus prior (c).Fluxes are given in units of g C m −2 yr −1 .

Figure 6 .
Figure 6.Annual integrated influence for 2007 of the current atmospheric network.Footprint influence is presented on a logarithmic scale and units are in log 10 [ppm (µmol m −2 s −1 ) −1 ]

Figure 7 .
Figure 7. Monthly and annual carbon flux budget, integrated over the European domain.Note that both inversions share the same annual prior uncertainty but monthly uncertainties differ.Blue and red error bars denote the prior uncertainty for the B1 and S1 scenarios, respectively.

Figure 9 .
Figure 9. Overview of the model performance (S1 case) summarized in a Taylor diagram.Posterior and prior monthly-and country-scale aggregated biospheric fluxes are compared against the reference fluxes (true).Each line corresponds to a different country.The starting point of each arrow shows the prior and reference comparison and the ending point shows the posterior and reference comparison.Ideally the ending point should coincide with the green point, which represents the reference model.

Figure 10 .
Figure 10.Mean monthly NEE averaged over the 53 different eddy covariance site locations as reported in Kountouris et al. (2015).A priori (black), true (green) and posterior fluxes for scenarios B1 (blue) and S1 (red) are shown.Units are in g C m −2 day −1 .

Figure 11 .
Figure 11.(a) Mean spatial correlation of monthly fluxes with true fluxes as functions of the spatial flux aggregation scale for prior fluxes (grey) and for posterior fluxes from scenarios B1 (blue) and S1 (red).(b) Mean spatial correlation of fluxes with true fluxes at 2 • spatial resolution as functions of temporal flux aggregation scale for prior fluxes (grey) and for posterior fluxes from scenarios B1 (blue) and S1 (red).

Table 4 .
Performance of the two error structures expressed as the spatial RMSD of the optimized monthly and annual NEE fluxes compared to the truth for the whole domain in µmole m −2 s −1 .Temporal evolution of monthly NEE for selected European countries for the synthetic data inversion.