A biogenic CO 2 flux adjustment scheme for the mitigation of large-scale biases in global atmospheric CO 2 analyses and forecasts

Forecasting atmospheric CO2 daily at the global scale with a good accuracy like it is done for the weather is a challenging task. However, it is also one of the key areas of development to bridge the gaps between weather, air quality and climate models. The challenge stems from the fact that atmospheric CO2 is largely controlled by the CO2 fluxes at the surface, which are difficult to constrain with observations. In particular, the biogenic fluxes simulated by land surface models show skill in detecting synoptic and regional-scale disturbances up to sub-seasonal time-scales, but they are subject to large seasonal and annual budget errors at global scale, usually requiring a posteriori adjustment. This paper presents a scheme to diagnose and mitigate model errors associated with biogenic fluxes within an atmospheric CO2 forecasting system. The scheme is an adaptive scaling procedure referred to as a biogenic flux adjustment scheme (BFAS), and it can be applied automatically in real time throughout the forecast. The BFAS method generally improves the continental budget of CO2 fluxes in the model by combining information from three sources: (1) retrospective fluxes estimated by a global flux inversion system, (2) land-use information, (3) simulated fluxes from the model. The method is shown to produce enhanced skill in the daily CO2 10-day forecasts without requiring continuous manual intervention. Therefore, it is particularly suitable for near-real-time CO2 analysis and forecasting systems.


Introduction
Earth-observing strategies focusing on carbon cycle systematic monitoring from satellites, flask and in situ networks (Ciais et al., 2014;Denning et al., 2005) are leading to an increasing number of near-real-time observations available to systems such as those developed in the framework of the European Union Copernicus Atmosphere Monitoring Service (CAMS).CAMS uses the Numerical Weather Prediction (NWP) Integrated Forecasting system for Composition (C-IFS) of the European Centre for Medium range Weather Forecasts (ECMWF) to produce near-real-time global atmospheric composition analysis and forecasts, including CO 2 (Agustí-Panareda et al., 2014) along with other environmental and climate relevant tracers (Flemming et al., 2009;Morcrette et al., 2009;Massart et al., 2014).The purpose of the real-time CO 2 analysis/forecasting system is to provide timely products that can be used by the scientific community among other users.For example, those working on new instruments, field experiments, satellite retrieval products, regional models requiring boundary conditions, or planning flight campaigns.
The current C-IFS CO 2 analysis is produced by assimilating CO 2 data retrieved from GOSAT by the University of Bremen (Heymann et al., 2015), as well as all the meteorological data that is routinely assimilated in the operational meteorological analysis at ECMWF.Massart et al. (2015) have shown that the atmospheric data assimilation system alone cannot completely remove the biases in the background atmospheric CO 2 associated with the accumulation of errors in the CO 2 fluxes from the model.This happens because currently the CO 2 surface fluxes in the C-IFS data assimilation system cannot be constrained by observations.The model biases in atmospheric CO 2 also present a problem for the data assimilation system because its optimization relies on the assumption that both model and observations are unbiased.It is therefore imperative to remove any large biases before assimilating observations.In this paper, we present a method to reduce the atmospheric CO 2 model biases by adjusting the CO 2 surface fluxes in a near-real-time CO 2 analysis/forecasting system, such as the one used by C-IFS at ECMWF.
Many different methods already exists to adjust CO 2 fluxes by using observations of atmospheric CO 2 within flux inversion systems (Rödenbeck et al., 2003;Gurney et al., 2003;Peters et al., 2007).However, these are not all suitable for the C-IFS real-time monitoring system.Flux inversion systems adjust the fluxes by either inferring the model parameters in Carbon Cycle Data Assimilation Systems also known as CCDAS (Rayner et al., 2005;Scholze et al., 2007;Rayner et al., 2011), or the fluxes themselves (Houweling et al., 2015).CCDAS has the advantage of working in prognostic mode once the model parameters have been optimized.Nevertheless, it can also be prone to aliasing information to the wrong model parameter when the processes that contribute to the variability of atmospheric CO 2 are not properly represented in the model or missing altogether.Estimating directly the CO 2 fluxes does not rely on the accurate representation of complex/unknown processes in the CO 2 flux model, but the resulting optimized fluxes do not have predictive skill.Both approaches generally use long data assimilation windows of several weeks to years in order to be able to constrain the global mass of CO 2 by relying mainly on high quality in situ flask and continuous observations which are relatively sparse in time and space.This general requirement for long assimilation windows is incompatible with the current NWP framework (e.g. a 12 h window is currently used in the C-IFS).In addition to that, the CO 2 observations from flask and most in situ stations used by these flux inversion systems are not available in near-real time.
Considering all the aspects mentioned above, a biogenic flux adjustment scheme (hereafter called BFAS) suitable for the NWP framework is proposed which aims to combine the best characteristics of both flux inversion approaches.Namely, the mass constraint from the optimized fluxes is used to correct the biases of the modelled CO 2 fluxes while keeping the predictive skill of the modelled fluxes at synoptic scales.The main objective of BFAS is to reduce the large-scale biases of the background atmospheric CO 2 .This should improve the representation of the atmospheric CO 2 large-scale gradients, and thereby also lead to a better forecast of atmospheric CO 2 synoptic variability.
The details of the flux adjustment scheme are provided in Sect. 2. Section 3 describes the C-IFS experiments done to test the impact of BFAS on the atmospheric CO 2 forecast.From the experiments, different aspects of the flux adjustment can be monitored (i.e. the scaling factors and the resulting budget) as shown in Sect. 4. The resulting atmospheric CO 2 forecast fit to observations after applying BFAS is presented in Sect. 5.The potential use of BFAS for model development and the possibility of including BFAS in the data assimilation system are discussed in Sect.6.Finally, Sect.7 gives a summary of the flux adjustment achievements and possible developments for the future.

Methodology
Any atmospheric CO 2 analysis/forecast system requires a flux adjustment of some sort in order to constrain the budget of sources/sinks a the surface and avoid the growth of mean errors in the atmospheric background (Agustí-Panareda et al., 2014).The scientific question addressed in this paper is how to use the best information we have in nearreal time to adjust the fluxes in a way that reduces the bias of the atmospheric CO 2 in the model with the minimum deterioration of the synoptic skill to predict day-to-day variability.
Agustí-Panareda et al. ( 2014) documented the configuration of the CO 2 forecasting system and showed that the large biases in atmospheric CO 2 are consistent with errors associated with the budget of CO 2 surface fluxes, in particular the net ecosystem exchange (NEE) modelled by the CTESSEL carbon model (Boussetta et al., 2013) within the C-IFS.
There are three main reasons for modelling NEE fluxes online as opposed to using offline fluxes such as optimized fluxes from flux inversion systems directly in the model: (i) the coupling of CO 2 biogenic fluxes with the atmospheric model can lead to improvements in both the understanding of interactions between ecosystems and the evolution of CO 2 in the atmospheric boundary layer (Lu et al., 2001;Moreira et al., 2013) and the forecast skill of energy and water cycle fluxes in NWP models (Boussetta et al., 2013); (ii) the use of offline fluxes would entail a loss of information and the introduction of topographical inconsistencies when downscaling fluxes from low resolution (e.g.typically a few degrees in optimized fluxes) to high resolution (e.g.currently 9 km in ECMWF NWP model); (iii) the non-availability of these offline fluxes in near-real time implies the interannual vari-ability of the NEE fluxes (Schaefer et al., 2002) cannot be represented.
The challenge remains of how to reduce the large-scale biases associated with the modelled fluxes in real time.Because these biogenic fluxes are modelled online, a one-off scaling of the fluxes using a climatology of the annual global budget (Nassar et al., 2010;Chen et al., 2013) or re-scaling locally the NEE in order to get a better fit with the seasonal cycle (Messerschmidt et al., 2013;Keppel-Aleks et al., 2012) are not suitable methods, as we do not know the annual budget of the model in real-time.
Optimized fluxes from flux inversion systems constitute the best available estimate of the CO 2 fluxes given the observed variations of CO 2 in the atmosphere at global scales.Thus, they can provide a reference benchmark for the modelled fluxes.The large-scale biases in the CO 2 fluxes can be diagnosed by computing the budget (i.e.integrated) differences between modelled fluxes and optimized fluxes over continental and supra-synoptic spatial and temporal scales (≥ 1000 km, 10 days).Working with budgets over scales beyond the synoptic scale allows the detection of large-scale biases without interfering with the synoptic skill of the model.
It is important to note that there are uncertainties and limitations that should be considered when using optimized fluxes.Optimized fluxes are computed with flux inversion systems at low resolutions (∼ hundreds of km) compared to the NWP resolution used for the CO 2 forecasts (∼ tens of km), and they are most reliable at continental and suprasynoptic scales.Moreover, they have the limitation of not being available in near-real time, unlike the meteorological observations or CO 2 satellite retrievals (Massart et al., 2015).Because of that, a climatology of the optimized fluxes has to be used as a reference.
Finally, optimized fluxes only provide information on the total CO 2 flux because flux inversion systems are not able to attribute the CO 2 variability to the different processes controlling the fluxes, such as vegetation, anthropogenic sources and fires.Generally, from all these fluxes, the land CO 2 fluxes from vegetation and soils in models are associated with high uncertainty (Le Quéré et al., 2015).For this reason, the Global Carbon Project provides the CO 2 budget from land vegetation -also known as the land sink -as a residual to close the carbon budget (see www.globalcarbonproject.org/carbonbudget, Le Quéré et al., 2015).Following the land sink residual approach, the optimized NEE can be computed as the residual of optimized fluxes by subtracting the other prescribed fluxes.A set of 10day mean budgets of this residual NEE from optimized fluxes is then computed daily for different regions and vegetation types over a period of 10 years to build the NEE climatology that can be used as a reference.In order to account for the inter-annual variability of NEE, the reference climatology is also adjusted with an inter-annual variability factor obtained from the model.
The flux adjustment scheme essentially estimates the bias of the modelled NEE budget with respect to the reference NEE budget for each region and vegetation type as a scaling factor α: where f is the 10-day mean NEE budget computed daily over a specific vegetation type and region, f O is the reference budget based on the MACC-13R1 optimized fluxes (Chevallier et al., 2010), and f M is the budget of the modelled fluxes.Figure 1 shows how the BFAS scheme interacts with the model to produce the flux-corrected atmospheric CO 2 forecast.First of all, the uncorrected NEE fluxes from the model are retrieved.Then their budget is compared with the budget of the NEE climatology from the optimized fluxes adjusted with the NEE anomaly from the model.The scheme produces maps with scaling factors of the biogenic fluxes before the forecast run.Subsequently, these maps are then used to scale the forecast of NEE.There are three major building blocks required for the computation of these scaling factors: the computation of the NEE budget using temporal and spatial aggregation criteria (e.g. 10 days, vegetation types, different regions); the partition of the NEE adjustment into the two modelled ecosystem fluxes that make up the NEE flux: i.e. gross primary production (GPP) associated with photosynthesis and ecosystem respiration (R eco ) documented by Boussetta et al. (2013).
These different aspects are discussed in further detail below in Sects.2.1 to 2.4.

Computation of NEE budget
The biases of the NEE fluxes that we aim to correct are partly linked to model parameter errors that depend on vegetation type and to errors in the meteorological/vegetation state which are region-dependent (e.g.radiation, LAI, temperature and precipitation).In addition to that, the global optimized fluxes used as reference do not currently have a strong constraint from observations at small spatial and temporal scales due to the sparse observing network of atmospheric CO 2 .Therefore, the NEE biases are not diagnosed at the model grid-point scale, but as biases in the NEE budget over continental regions for different vegetation types and over a period of 10 days.The 10-day regional budget provides an indicator on the large-scale biases.Moreover, 10 days is a period that can be used in the current framework of the C-IFS global atmospheric CO 2 forecasting system.Figure 2 shows how the uncorrected NEE from the past forecasts can be combined to compute the 10-day mean budget before each new forecast.The 1-day forecasts initialized from the previous seven days are used together with the last 3-day forecast available in order to create a 10-day window around the initial date of the new forecast.This 10-day time window is slightly shifted to the past because otherwise forecasts longer than 3 days would be required to compute the budget, while errors in the meteorology affecting the fluxes grow with forecast lead time.Chevallier and Kelly (2002) found that forecast errors associated with the location of extra-tropical weather systems affecting the cloud cover and temperature gradientswhich in turn will affect the NEE errors -are very small at day 1.These errors continue to be small up to day 3, but they can grow rapidly with forecast lead time (see Haiden et al., 2015, for details on the IFS forecast error evaluation).The different regions have been selected according to latitudinal band characterized by seasonal cycle (Northern Hemisphere, tropics, and Southern Hemisphere), continental region, and vegetation type.
In the C-IFS the vegetation types follow the BATS classification (Dickinson et al., 1986), which is widely used in meteorological and climate models.The vegetation classification is designed to distinguish between roughness lengths for the computation of the momentum, heat and moisture transfer coefficients in the modelling of the fluxes from surface to atmosphere.However, the BATS vegetation types are not always suitable for the modelling of the CO 2 fluxes.For example, the interrupted forest type which constitutes around 25 % of the high vegetation cover encompasses many different types of vegetation, including tropical savanna and a combination of remnants of forest or open woods lands with field complexes.This could be an important source of error in some regions.For this reason, BFAS allows the introduction of new vegetation types for diagnosing the NEE biases.Tropical savanna, which covers large areas in the tropical region, has been added as a subtype of the interrupted forest vegetation type by using the Olson Global Ecosystem classification (Olson, 1994a, b, https://lta.cr.usgs.gov/glcc/globdoc2_0).
Figure 3 shows the distribution of the dominant vegetation types used in BFAS.Land cover maps from GLCC version 1 (https://lta.cr.usgs.gov/glcc)are used to compute the land cover of the dominant high and low vegetation types at each grid point.In BFAS, only one dominant vegetation type is used to classify each grid point, and this must cover more than 50 % of the grid box.Model grid points with less than 50 % vegetation cover are not used.The comparison of the modelled NEE with the optimized NEE fluxes is done by computing 10-day budgets for each of the 16 vegetation types (see Table 1) and 9 different regions (see Fig. 3).

Reference NEE budget
The residual NEE from optimized fluxes provides the reference for the flux adjustment scheme.Currently, there is no operational centre providing CO 2 optimized fluxes at global scale in near-real time.We have chosen to use the MACC optimized fluxes (Chevallier et al., 2010) which are delivered around September each year for the previous year.The MACC optimized CO 2 fluxes are regularly improved and their high quality has been recently shown by Kulawik et al. (2016).Chevallier (2013) provides an evaluation of the inverted CO 2 fluxes for 2010.
The computation of the residual is done by subtracting the prescribed fluxes used in the C-IFS CO 2 forecast over land from the total optimized flux.The prescribed CO 2 fluxes from biomass burning and anthropogenic emissions in the CO 2 forecast are not the same as the ones used as prior fluxes in the MACC flux inversion system.Not only they are from different sources, but they are also used at different resolutions.This means that there might be fires represented in one and not the other, or with different emission intensities, as it is the case for anthropogenic hotspots at high vs. low resolutions.Thus, in order to avoid the transfer of inconsistencies between the prescribed and prior fluxes into the NEE residual, the regions with very high anthropogenic emissions (larger than 3 × 10 6 g C m −2 s −1 ) and fires are filtered out.
A climatology of these reference NEE fluxes is created using the last 10 available years and it is updated every time a new year is available.Thus, allowing for slow decadal variations in the NEE reference.Figure 4 shows a comparison of the optimized flux budget in 2010 and its climatology for the crop vegetation type in North America.The inter-annual variability of the optimized flux budget is depicted by the standard deviation around the 10-year climatological mean value.The reference NEE climatology is then adjusted to account for the inter-annual variability of the land sink fluxes as follows: where f is the 10-day NEE budget for a specific region and vegetation type, f O is the reference budget, f Oclim and σ (f Oclim ) are the climatological mean and standard deviation of the optimized flux budget, respectively, from 2004 to 2013, and γ is the corresponding standardized anomaly of the NEE budget from the model with respect to the same period.γ can be positive or negative.It represents the interannual variability factor used to adjust the reference climatological NEE budget and it is given by where f M is the model NEE budget, f Mclim is the climatological mean budget from the model and σ (f Mclim ) is the standard deviation of the model NEE budget denoting the typical amplitude of its inter-annual variability for the same period as the climatology of the optimized flux budget (i.e.2004 to 2013).The γ inter-annual variability factor is multiplied by the standard deviation of the optimized residual NEE budgetrepresenting the typical amplitude of inter-annual variability -in order to offset the reference climatological NEE budget.In this way, the inter-annual variability of the reference NEE follows the inter-annual variability of the model NEE with the same anomaly sign, while keeping its amplitude constrained by the standard deviation of the optimized flux budget.Note that the use of this factor is optional.By setting it to zero, the model budget can be constrained by the optimized flux climatology.The rationale for applying this factor in the C-IFS system is based on the fact that inter-annual variability of the NEE budget is strongly linked to the inter-annual variability of climate variables such as precipitation and temperature (Schaefer et al., 2002).Since information on these climate variables is readily available in the C-IFS system, it is worth exploring its impact on the CO 2 forecast.A preliminary assessment of the impact of including the inter-annual variability factor was performed by comparing experiment with and without the factor.Results confirmed a small but positive impact (see Supplement).Details on the computation of this factor are given in the next section.

The inter-annual variability factor
The computation of the inter-annual variability factor γ requires a model climate consistent with the forecast (i.e.same meteorological analysis, same model version and same resolution).Producing a consistent model climate is not a trivial requirement, because both the operational model version and analysis system can change frequently with new updates and new observations, and high-resolution forecasts spanning a period of 10 years (i.e.2004 to 2013) are expensive.A feasible solution has been found where the standardized NEE anomaly from the model is computed using the operational ensemble prediction system (ENS) forecasts and hindcasts which are part of the ECMWF monthly forecasting system (Vitart et al., 2008;Vitart, 2013Vitart, , 2014)).Every Monday and Thursday the operational ENS is not only run for the actual date, but also for the same calendar day of the past 20 years.These hindcasts have the same resolution and model version as the ENS forecasts and they constitute a valuable data set used for the post-processing of the NWP forecasts from the medium-range (10 days) up to 1-month lead times (Hagedorn et al., 2012).The ensemble of forecasts is made of 5 members (10 members since 2015) using perturbed initial conditions (Lang et al., 2015) and stochastic physics in order to represent forecast uncertainty (Palmer et al., 2009).
As the hindcasts are not performed daily, it is not possible to aggregate consecutive 1-day forecasts into a 10-day period to compute a mean budget as shown in Fig. 2. In order to circumvent this, the mean budget is computed by averaging the 1-day forecast NEE from all the ensemble members available in the hindcasts.This is done for each year from 2004 to 2013 to preserve consistency with the NEE climatology from the optimized fluxes.The model climate f Mclim given by the 10-year mean budget and its typical inter-annual variability σ f Mclim can then be obtained by calculating the mean value and standard deviation, respectively, over that period.Similarly, the model budget f M is calculated from the NEE ensemble mean of the ENS forecast for the current date using the same number of ensemble members as the ENS hindcasts.The standardized anomaly γ is finally obtained by subtracting the 10-year mean budget from the current budget and dividing the anomaly by the standard deviation.Since the hindcasts are available every Monday and Thursday, γ is only updated twice a week.These updates are routinely monitored during the forecast (see Sect. 4).

Partition of NEE adjustment
The final stage in the flux adjustment is the attribution of the NEE correction to the different biogenic fluxes in the model.The residual NEE from optimized fluxes only provides information on the total flux from the land ecosystem exchange.While in land vegetation models, NEE is the combination of two opposing fluxes: gross primary production (GPP) and the ecosystem respiration (R eco ).Given that we have no information on whether the NEE error is associated with the GPP or the R eco fluxes, a strategy has to be defined in order to partition the NEE correction into GPP and R eco .The underlying strategy used here is to have the smallest flux adjustment possible.Namely, the scaling factors should be as close to 1 as possible.
The first step is to distinguish between the positive and negative values of the NEE scaling factor (α).A positive NEE scaling factor implies the budget of the NEE in the model has the correct sign but the wrong magnitude.In that case, the scaling of the flux will be smallest if the dominant component of NEE is scaled.That is to say, the flux correction will be applied to GPP during the growing season and to R eco during the senescence period.Whereas if the scaling factor is negative -i.e. the modelled NEE has the wrong sign -only the flux with smallest magnitude is corrected (GPP or R eco ) to ensure the scaling factor of the modelled fluxes is always positive.
The scaling factor α is then converted into a scaling factor for the dominant component of the NEE flux.If the magni- tude of GPP is larger than the magnitude of R eco , then the scaling factor for GPP and R eco are defined as follows: This partition of the flux adjustment is a modelling choice based on minimum flux adjustment criteria.Other solutions might be possible given additional information on either GPP or R eco budgets.
The α GPP and α R eco factors are computed for each vegetation type and region and then re-mapped as two-dimensional fields using the dominant vegetation type map in Fig. 3.The resulting maps for α GPP and α R eco are subsequently passed to the carbon module in the land surface model in order to scale GPP and R eco .

CO 2 forecast simulations
Several simulations have been performed in order to test the impact of BFAS on the atmospheric CO 2 forecasts (see Table 2).All the simulations use the C-IFS CO 2 forecasting system (Agustí-Panareda et al., 2014) based on the IFS model (www.ecmwf.int/en/forecasts/documentation-and-support).They all share the same transport.The only difference between them is the CO 2 surface fluxes they use as described in Table 2.The impact of BFAS is assessed by comparing the simulations using modelled NEE fluxes without BFAS (CTRL) and with BFAS (BFAS).The BFAS simulation is also compared with the simulations using optimized fluxes (OPT) and a climatology of optimized fluxes (OPT-CLIM).Both OPT and OPT-CLIM simulations constitute a benchmark because they are driven by the reference fluxes used in BFAS.From these experiments we expect to see the forecast from BFAS to be closer to the benchmark forecasts (in particular OPT-CLIM) than to the CTRL forecast.
The forecasts are performed using the cyclic configuration described by Agustí-Panareda et al. ( 2014) with a spectral resolution of TL255, equivalent to around 80 km in the horizontal, and 60 vertical levels.They are initialized daily at 00:00 UTC with ECMWF operational analysis, while the atmospheric CO 2 is cycled from one forecast to the next, as in a free run.The simulations span the period from 1 January to 31 December 2010.This period has been selected because of the large variety of observations available to evaluate the BFAS performance on the atmospheric CO 2 forecasts.The CO 2 initial conditions on 1 January 2010 are from the atmospheric CO 2 analysis using GOSAT CO 2 retrievals (Heymann et al., 2015).The flux adjustment is monitored by plotting time series of the flux scaling factors for each vegetation type and region.For example, Fig. 5 shows the GPP and R eco scaling factors for the crop vegetation type which is present in all regions.The values range from 0.5 to 6.These coefficients are computed daily before the beginning of each forecast and they are kept constant throughout the forecast.Generally, there is a slow variation of the coefficients from one day to the next.This is expected since the coefficients are obtained from large-scale budgets computed over a 10-day period.The map of the GPP and R eco scaling factors applied to adjust the modelled biogenic fluxes on 15 March 2010 is shown in Fig. 6.These maps can be very useful to monitor the flux adjustment because they can provide alerts on the regions with largest biases to model developers.
The effect of the flux adjustment on the NEE budget is shown in Fig. 7.The adjusted biogenic fluxes should always lead to an NEE budget close to the budget of the optimized NEE climatology.However, the fit will also depend on the degree of inter-annual variability of the model determined by parameter γ in Eq. (3). Figure 8 displays the monitoring of γ given by the standardized NEE anomaly of the model.Positive values mean the CO 2 source is larger than normal and/or the CO 2 sink is lower than normal with respect to the 10year mean budget of the model, covering the same period as the reference climatology.Conversely, negative values correspond to a smaller than normal source and/or larger than normal sink.When γ is larger than 1, the model anomaly is larger than 1σ .This indicates the possible occurrence of an extreme event.Prolonged extreme events -such as droughts -would have an effect on the NEE budget and the computation of the biogenic flux adjustment.

Impact of the flux adjustment
The impact of BFAS is shown by comparing the atmospheric CO 2 from the BFAS forecast to the CTRL forecast, and to the benchmark forecasts with optimized fluxes (OPT and OPT-CLIM) at several observing sites.Four sites from the NOAA/ESRL atmospheric baseline observatories (www.esrl.noaa.gov/gmd/obop,Thoning et al., 2012) are used to evaluate the reduction of the large-scale biases in the wellmixed background air.In addition, four Total Carbon Column Observing Network stations (GGG2014 TCCON data, Wunch et al., 2011, see Table 3 and www.tccon.caltech.edu)are also used to assess the impact on the atmospheric CO 2 column-average dry molar fraction.Finally, three continental sites from the NOAA/ESRL tall tower network (www.esrl.noaa.gov/gmd/ccgg/towers,Andrews et al., 2014) are used to investigate the impact of BFAS on the synoptic skill of the forecasts.The results are grouped into the impacts on bias  reduction and synoptic skill in the following two sections.A comprehensive evaluation of the uncertainty reduction in the BFAS simulation based on all the ObsPack (2015) (Masarie et al., 2014) in situ flask and continuous observations, as well as the NOAA/ESRL aircraft vertical profiles (Sweeney et al., 2015) is also provided in the Supplement.

Biases in atmospheric CO 2
Figure 9 demonstrates that BFAS is very effective at reducing the atmospheric CO 2 biases in the background air at all the NOAA/ESRL continuous baseline stations.The biases in the CTRL forecast range from −1.9 to −4.5 ppm; whereas, the BFAS forecast has biases of −0.5 ppm or less over the whole year.These values are close to the annual biases of the OPT and OPT-CLIM experiments ranging between −0.4 and 0.5 ppm.The monthly biases in BFAS can be larger than its annual biases.For example, there is a bias of up to −1 ppm from March to September in the Southern Hemisphere (Fig. 9c, d).This bias is thought to originate in the tropical regions and transported to the Southern Hemisphere as shown by a preliminary comparison with IASI CO 2 (Crevoisier et al., 2009) (not shown here).The bias starts to grow at the end of the growing season during summer time.This is also the case for the high latitude station at Barrow, where there is a negative bias of a few ppm from the last week of July to the end of September as shown in Fig. 9a.In summary, BFAS is not able to completely remove the negative model bias at the end of the growing season.In the Northern Hemisphere at the end of winter and throughout spring (from March to May) there is a positive model bias, i.e. the atmospheric CO 2 is overestimated in the model.Although the OPT and OPT-CLIM simulations also have a slight posi- tive bias in winter, this positive bias is enhanced in the BFAS simulation.
At the TCCON sites (Fig. 10), the atmospheric CO 2 column-average dry molar fraction also shows the same large bias reduction in BFAS with respect to CTRL.The magnitude of the BFAS annual biases in the atmospheric column is generally less than 1 ppm, slightly higher than the OPT and OPT-CLIM biases (less than 0.5 ppm), but much lower than the CTRL biases (from 1.5 to 3.3 ppm).The results at the TC-CON sites are consistent with those from the NOAA/ESRL baseline sites.Namely, in the Northern Hemisphere there is a growing overestimation of the atmospheric CO 2 at the end of winter (around March).While at the end of the growing season in both the Northern Hemisphere and the Southern Hemisphere (August and March, respectively) there is a growing negative bias, i.e. an overestimation of the sink.One hypothesis that could explain why BFAS is not able to achieve as small a bias as the forecast with optimized fluxes lies in the fact that the optimized NEE used as a reference in BFAS is computed as a residual after removing the effect of fires and anthropogenic fluxes.Inconsistencies in the fire and anthropogenic emissions used by the optimized fluxes and the model will lead to errors in the optimized residual NEE.These inconsistencies are mainly associated with the use of different resolutions.Further investigation is required to address this issue.

Synoptic variability of atmospheric CO 2
The C-IFS CO 2 forecast has been shown to have high skill in simulating the synoptic variability of atmospheric CO 2 (see Agustí-Panareda et al., 2014), except during the spring months, coinciding with an early start of the CO 2 drawdown period in the model.For this reason, we have examined the impact of BFAS on the synoptic variability of daily mean atmospheric CO 2 at three continental NOAA/ESRL tower sites in March.Over this period, the day-to-day variability of atmospheric CO 2 at those sites is associated with the advection of atmospheric CO 2 by baroclinic synoptic weather systems as they impinge on the large-scale continental gradient of atmospheric CO 2 .Table 4 clearly demonstrates that with BFAS the synoptic forecast skill is greatly improved at all sites, with correlation coefficients between simulated and observed atmospheric CO 2 exceeding 0.8.The improvement is particularly striking at Park Falls (Wisconsin, USA) and West Branch (Iowa, USA) at the centre of North America, where the correlation coefficients in CTRL are very low (i.e.below 0.5).The OPT and OPT-CLIM forecasts have generally high correlation coefficients, comparable to BFAS.Only at the level closest to the surface, the values are slightly lower than BFAS.This can be explained by the fact that the MACC-13R1 optimized fluxes do not comprise synoptic variability.Thus, when the synoptic variability of the fluxes contributes to the atmospheric CO 2 variability, the correlation coefficients are smaller.
The positive impact of BFAS on the CO 2 synoptic variability is illustrated in Fig. 11.The large synoptic variabil- ).In the case study here, the CO 2 -rich air is located to the south of the observing stations, as shown by the distribution of the monthly mean atmospheric CO 2 depicting the large-scale gradients across the continent at the level corresponding to the height of the tall towers (Fig. 12a and b).In the CTRL forecast, there is no monthly mean gradient south of the stations (Fig. 12c).This explains why without BFAS the synoptic variability is very small and largely underestimated throughout March.While in BFAS the gradient south of the observing stations is very pronounced (Fig. 12d), following a similar pattern to OPT and OPT-CLIM.There are still some differences between the three simulations.OPT-CLIM results in stronger gradients than OPT and BFAS enhances the gradient even further, leading to a slight overestimation of the synoptic variability.These differences in the patterns of the atmospheric CO 2 are directly linked to the differences in the CO 2 surface fluxes (Fig. 13).As expected, the flux adjustment from BFAS results in a flux pattern similar to OPT-CLIM and OPT, with a stronger source to the south of the observing stations.Whereas in CTRL there is a large sink area south of the observing stations, in the region of the Gulf of Mexico, consistent with the CTESSEL early growing season (Balzarolo et al., 2014).

Discussion
All the results from the BFAS experiments indicate that BFAS is highly beneficial to the C-IFS CO 2 forecasting system, both in terms of reducing the atmospheric CO 2 biases and improving the synoptic skill of the model.As shown in Sect.2, the scheme is simple and it is easy to implement and run.Because BFAS essentially works as a layer on top of the model, it can adapt to model changes with great flexibility.For all these reasons, BFAS is now part of the operational global C-IFS analysis and forecasting system.Notwithstanding all the advantages of BFAS listed above, there are also caveats that need to be considered, further tested, and addressed.A discussion of the current limitations of BFAS is provided in this section, together with the potential use of BFAS for model development, data assimilation purposes, and the implications for users.

Current limitations in BFAS
Optimized fluxes have uncertainties of their own and represent the large-scale variability of the CO 2 surface fluxes on supra-synoptic time-scales.They only estimate the total flux and the NEE residual approach can transfer biases from other fluxes into the NEE.The use of a climatology also precludes the correction of the inter-annual variability in the model.
The aggregation criteria of budget errors can be very challenging because the error can originate from different aspects of the model.Clearly, errors in model parameters associated with vegetation type are a good candidate.However, in the future errors in climate forcing, errors in LAI, missing processes and other potential sources of error should also be considered.
The partition of the NEE flux adjustment into the modelled biogenic fluxes (GPP and R eco ) is currently ad hoc, leading to the transfer of errors from GPP to R eco and vice-versa.This problem could be addressed by using other independent data sets of GPP and R eco (e.g.Jung et al., 2011) that contain additional information on how to partition the NEE adjustment.

BFAS for model development
BFAS can run in both online and offline modes.Thus, it can provide a tool to diagnose regions that contribute to the errors in the global budget resulting in large-scale errors of atmospheric CO 2 .The maps of biogenic flux scaling factors can be used to compute maps of flux adjustment (e.g.adjusted NEE -original NEE) which can then be used to diagnose model errors.The synthesis of the mean adjustments into monthly model biases for different vegetation types can then guide the effort to develop the carbon model further.For example, in regions where the bias is consistent between different months, the corrected NEE could be used to re-tune model parameters such as the reference ecosystem respiration or the mesophyll conductance, previously optimized by Boussetta et al. (2013) using a subset of FLUXNET data.Specific vegetation types can be identified where model improvements could be achieved by using information from BFAS.For instance, crops have the same large R eco scaling (> 1.5) over all the Northern Hemisphere regions during winter months when the ecosystem respiration is the dominant component of NEE.This underestimation in the ecosystem respiration can be addressed by modifying the value of the reference respiration parameter used for crops.In this case, the same procedure used by Boussetta et al. (2013) could be applied to optimize the specific model parameter using the BFAS adjusted fluxes as pseudo-observations together with the FLUXNET data.
Further information on error sources in fluxes can be obtained by comparing the corrected fluxes with the eddy covariance observations available in near-real time from the Integrated Observation System (ICOS) Ecosystem Thematic Centre (ETC, http://www.europe-fluxdata.eu).For example, preliminary comparisons have shown that there are large dif- ferences in the model-observation fit between needle leaf evergreen (pine) trees in the boreal and Mediterranean regions.This is consistent with results from Balzarolo et al. (2014), and it highlights the need for a new sub-classification of the evergreen needle leaf forests in regions with Mediterranean climate.

BFAS in the data assimilation framework
Currently, BFAS is only designed to be used as a bias correction computed before each forecast by using a reference data set based on optimized fluxes.In the future, BFAS could be adapted to work within a data assimilation (DA) framework in the C-IFS.To start with, the specification of uncertainties associated with both the reference data set and the model fluxes and the covariance of those uncertainties would allow a more optimal estimation of the flux adjustment.These uncertainties can be obtained from the flux inversion systems for the optimized fluxes and from the ECMWF ENS forecasts for the model fluxes.
Including BFAS in the C-IFS DA framework needs further exploration.The C-IFS uses a short time window (currently 12 h) to assimilate meteorological observations from very dense observing networks.With the short time window it is not possible to properly constrain the slowly varying global mass of the long-lived greenhouse gases due to the sparseness of their observing system.For instance, the current GOSAT and OCO-2 CO 2 observations do not cover high latitudes in winter.However, if we combined the assimilation of optimized fluxes (which already contain the global mass constraint) with observations linked to local fluxes (e.g.solar-induced chlorophyll fluorescence products from satellites, NEE eddy covariance observations and in situ atmospheric CO 2 observations) it might be possible to obtain an optimal estimate of more local scaling factors, while still respecting the global mass constraint.The possibility of optimizing the scaling factors in the DA system within the weak constraint framework (Trémolet, 2006(Trémolet, , 2007) also needs to be explored in the future.

Aspects to be considered by users
The implementation of BFAS is straightforward.Therefore, it could be easily used by other models.The only requirements are: (i) a reference budget which can be obtained from a climatology of optimized fluxes (e.g. the MACC product can be easily obtained from www-lscedods.cea.fr/invsat/PYVAR14_MACC/V2/Fluxes/3Hourly and it is well documented); (ii) past 10-day NEE simulated by the forward model; (iii) the NEE anomaly of the forward model with respect to its climate based on a 10-year simulation.The use of the NEE anomaly is optional, as its impact is relatively small (see Supplement).
The underlying motivation of BFAS is to improve the CO 2 analysis and forecast for users (e.g.those working on flux inversion systems, planning field experiments, or requiring boundary conditions for regional models).For this reason, it is paramount to provide information on all the input data going into BFAS.These are primarily continentalscale climatological budgets from modelled NEE and optimized fluxes.There is also some input from the anthropogenic emissions and the biomass burning emissions to extract the NEE as a residual from the optimized fluxes.The documentation of the specific components used in the C-IFS BFAS system and their uncertainties can be found in Boussetta et al. (2013), Chevallier et al. (2010), Chevallier (2015), Janssens-Maenhout et al. ( 2012) and Kaiser et al. (2012).The input data streams used in BFAS can be obtained from http://copernicus-support.ecmwf.intfor C-IFS NEE and GFAS biomass burning fluxes; from the EDGAR database http://edgar.jrc.ec.europa.eufor the anthropogenic fluxes; and from www-lscedods.cea.fr/invsat/PYVAR14_MACC/V2/Fluxes/3Hourly for the MACC optimized fluxes.
Since the BFAS product contains information from the optimized fluxes, users should be aware that the optimized fluxes assimilated most available background air-sample monitoring sites (listed in the supplement of Chevallier,   acp-15-11133-2015-supplement.pdf).A specification of the overall uncertainty associated with the BFAS simulation and the resulting reduction with respect to the control simulation is given in the Supplement.

Summary
This paper addresses the challenge of designing an online bias correction for an atmospheric CO 2 analysis/forecasting system.The overarching aim is to deliver an atmospheric CO 2 analysis and forecast that can be useful to the scientific community, e.g.working on data assimilation of atmospheric CO 2 observations, the development of the CO 2 observing system and providing boundary conditions for CO 2 regional modelling.Tuning model parameters and/or re-scaling fluxes offline are not sufficient to guarantee a bias reduction in the system.Thus, an online adaptive system is required because errors in the meteorology can evolve as a result of regular operational NWP model upgrades and these affect the NEE budget in the model.This is achieved in the new biogenic flux adjustment scheme (BFAS) by a simple scaling of the 10-day NEE budgets for different vegetation types and regions using a climatology of the MACC optimized fluxes (Chevallier et al., 2010) as a reference, adjusted to preserve the model inter-annual variability.This paper shows that BFAS has a positive impact on the atmospheric CO 2 forecast by greatly reducing the atmospheric CO 2 biases in background air and improving the synoptic variability in continental regions affected by ecosystem fluxes.The improvement in the synoptic skill of the forecast is associated with underlying changes in the large-scale gradient of the NEE fluxes where optimized fluxes provide information.
BFAS has been recently implemented in the C-IFS operational CO 2 forecast and analysis system, because of its simplicity, adaptability to model changes and beneficial impact.In this paper, the C-IFS model is just providing an example on how this method can be applied efficiently in an operational forecasting system.Other models could easily adopt such a system as there are only a few components required for its implementation (see Sect. 6.4).
As a diagnostic tool, BFAS has also the potential to provide feedback for model development.The use of BFAS in the data assimilation framework will be explored in the future.

Data availability
The C-IFS source code is integrated into ECWMF's IFS code, which is only available subject to a licence agreement with ECMWF.ECMWF member-state weather services and their approved partners will get access granted.The IFS code without modules for assimilation and chemistry can be obtained for educational and academic purposes as part of the openIFS release (https:// software.ecmwf.int/wiki/display/OIFS/OpenIFS+Home).A detailed documentation of the IFS code is available from https://software.ecmwf.int/wiki/display/IFS/CY40R1+Official+IFS+Documentation.The output from C-IFS can be requested via http://copernicus-support.ecmwf.int.Information on how to access the different data streams used in BFAS is provided in Sect.6.4.
The Supplement related to this article is available online at doi:10.5194/acp-16-10399-2016-supplement.

-
a reference NEE data set used to diagnose the model biases (e.g.optimized fluxes from global flux inversion systems such as the MACC-13R1 data set from Chevallier et al., 2010);

Figure 1 .
Figure 1.Schematic showing how BFAS fits in the atmospheric CO 2 forecasting system.BFAS is called before each forecast to compute the scaling factors for the model NEE (i.e.GPP + R eco ) based on the past archived forecasts.The maps of the scaling factors are then passed to the model which applies the adjustment to the output biogenic CO 2 fluxes from the land surface model.After combining the adjusted NEE fields with the other prescribed CO 2 fluxes, the resulting bias corrected fluxes are passed to the transport model to produce the atmospheric CO 2 forecast.

Figure 2 .
Figure 2. Schematic to illustrate how the 10-day NEE budget from the model is computed in BFAS for the forecast at day D by retrieving the past forecasts of accumulated NEE.Note that the retrieved NEE (computed by adding GPP and R eco ) has not been corrected by BFAS.The computation uses a set of 7 previous 1-day forecasts (initialized at D −8, D −7, D −6, . . .until D −2) together with the latest 3-day forecast from the previous day (i.e.D − 1) as shown by the blue boxes.

Figure 3 .
Figure 3. Dominant vegetation types based on the BATS classification used in the C-IFS and extended to include the tropical savanna subtype (in purple, as defined by the Olson (1994a) classification) within the interrupted forest type (in light blue).The vegetation type codes are described in Table1.The nine regions used in the computation of the NEE budget are delimited by the black lines.

Figure 4 .
Figure 4. Time series of 10-day mean NEE budget (GtC day −1 ) associated with the crop vegetation type in North America from the MACC-13R1 optimized flux data set in 2010 (red line) compared to its climatology (2004-2013) (yellow line).The yellow shading represents the standard deviation of the optimized flux budget (for the same period) used to compute the inter-annual variability adjustment applied to the reference climatology.Positive/negative values correspond to a source/sink of CO 2 .

Figure 5 .
Figure 5.Time series of GPP and R eco flux scaling factors in blue and red lines, respectively, for the crop vegetation type in 2010 in the different regions (see map in Fig. 3 depicting the extent of the crops within each region).

Figure 6 .
Figure 6.Map of scaling factors for (a) GPP and (b) R eco on 15 March 2010.

Figure 7 .
Figure 7. Time series of GPP (in blue), R eco (in red) and NEE (in green) daily budget (GtC day −1 ) before and after the flux adjustment (see dashed lines and solid lines, respectively) for crops in 2010 in the different regions.The reference budget provided by the climatology of MACC-13R1 optimized fluxes (2004-2013) and the MACC-13R1 optimized fluxes for 2010 are depicted by the black and magenta lines, respectively.Positive/negative values correspond to a source/sink of CO 2 .

Figure 8 .
Figure 8.Time series of the standardized anomaly of the modelled NEE budget (γ in Eq. 3) for crops in 2010 in the different regions.Positive values indicate larger/smaller CO 2 sources/sinks than normal based on the mean climatological budget; whereas negative values correspond to smaller/larger CO 2 sources/sinks than normal.

Figure 10 .
Figure 10.Daily mean atmospheric CO 2 column-average dry molar fraction (ppm) observed at four TCCON stations (see Table 3) as shown by the black circles, and simulated by the different forecast experiments: BFAS (cyan), CTRL (red), OPT (green) and OPT-CLIM (blue).See Table 2 for a description of the different experiments.The mean (δ) and standard deviation (σ ) of the model errors are shown at the top of each panel.

Figure 12 .
Figure 12.Monthly mean atmospheric CO 2 dry molar fraction (ppm) at the model level approximately corresponding to the highest sampling height of the Park Falls and West Branch NOAA/ESRL tall towers (see black triangles) in March 2010 from (a) OPT-CLIM, (b) OPT, (c) CTRL and (d) BFAS experiments (see Table2for a description of the different experiments).

Figure 13 .
Figure 13.Monthly mean total CO 2 flux (µmol m −2 s −1 ) in March 2010 from (a) OPT-CLIM, (b) OPT, (c) CTRL and (d) BFAS experiments (see Table 2 for a description of the different experiments).The black triangles depict the location of the NOAA/ESRL tall towers plotted in Fig. 11.

Table 1 .
Percentage of land grid points at model resolution TL255 (∼ 80 km) for each dominant vegetation type, i.e. more than half of the grid point is covered by that vegetation type.A land grid point is defined by a land sea mask value greater than 0.5.

Table 3 .
List of TCCON stations used in Fig.10ordered by latitude from north to south.

Table 4 .
(Keppel-Aleks et al., 2012 different forecast (FC) experiments (see Table2) with observations at three NOAA/ESRL tall towers for daily mean dry molar fraction of atmospheric CO 2 in March 2010.The dash symbol means the correlation is not significant.bythe advection of CO 2 -rich anomalies (with up to 10 ppm amplitude) as shown by the CO 2 peaks on 10-12 March at Park Falls, and 8-9, 12-13 and 16-17 March at West Branch.These CO 2 anomalies originate from the advection across the large-scale continental gradients of atmospheric CO 2 which ultimately reflect the largescale distribution of CO 2 surface fluxes(Keppel-Aleks et al., 2012