Interactive comment on “How much information do extinction and backscattering measurements contain about the chemical composition of atmospheric aerosol?”

Summary: The authors consider the assimilation of remote-sensing data (specifically aerosol extinction and backscattering coefficients) applied to aerosol fields within a chemical transport model. They describe how an additional term can be added to the 3D-var cost function so that the assimilation adjusts only those components (in a transformed space) for which the observations provide information. The additional term relies on the singular value decomposition of the scaled observation operator. In this way, the assimilation automates the choice of control variables in an otherwise highly


Introduction
Atmospheric aerosols have a substantial, yet highly uncertain impact on climate; they can cause respiratory health problems, degrade visibility, and even compromise air-traffic safety. The physical and chemical properties of aerosols play a key role in understanding these effects. The aerosol properties are determined by a complex interplay of different chemical, microphysical, and meteorological processes. These processes are investigated in environmental modelling by use of chemical transport models (CTMs). However, modelling aerosol processes is plagued by substantial biases and errors (McKeen et al., 2007). It is, therefore, fundamentally important to evaluate and constrain CTMs by use of measurements.
Measurements from satellite instruments provide consistent long-term data sets with global coverage. However, it is notoriously difficult to compare measured radiances to modelled aerosol concentrations. An alternative to using radiances is to make use of satellite retrieval products. For instance, one of the products of the CALIPSO lidar instrument (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) is a rough classification of the aerosol types (i.e.
Published by Copernicus Publications on behalf of the European Geosciences Union.
dust, smoke, clean/polluted continental, and clean/polluted marine). This retrieval product is based on lidar depolarization measurements (Omar et al., 2009). For the evaluation of aerosol transport models this provides us with a qualitative check for the chemical composition of aerosols. However, this is of limited practical use, since what we really need is quantitative information on the particles' chemical composition (which can be size-dependent). The most popular approach in evaluating and constraining aerosol transport models is the use of retrieved optical properties, such as aerosol optical depth, or extinction and backscattering coefficients. Yet another idea is to provide the particles' refractive index as a retrieval product (e.g. Müller et al., 1999; Veselovskii et al., 2002). However, the use of such retrieval products still leaves us with the challenge of solving an ill-posed inverse problem, namely, of determining the particles' chemical composition from their retrieved optical or dielectric properties.
A systematic class of statistical methods for solving this inverse problem is known as data assimilation. Recent studies have applied data assimilation to aerosol models with varying degrees of sophistication, ranging from simple dust models (Khade et al., 2013) and mass transport models (Zhang et al., 2014) to microphysical aerosol models based on modal (Rubin and Collins, 2014) or sectional descriptions (Sandu et al., 2005; Saide et al., 2013) of the aerosol size distribution. The assimilation techniques that have been used comprise variational methods, such as 2-D (Zhang et al., 2014), 3-D (Kahnert, 2008; Liu et al., 2011), and 4-D variational methods (Benedetti et al., 2009), as well as ensemble approaches (Sekiyama et al., 2010). Assimilation of satellite products for trace gases is relatively straightforward, since observed and modelled trace gas concentrations are almost directly comparable. However, aerosol optical properties observed from satellites are not directly comparable to the modelled size distribution and chemical composition of the aerosols. Solving this problem amounts to regularizing a severely under-constrained inverse problem. Previous aerosol assimilation attempts have been mainly based on educated guesses about the information content of the observations. For instance, there have been studies on the assimilation of aerosol optical depth (AOD) in which all chemical aerosol components in all size classes and at all model layers were used as independent control variables (Liu et al., 2011). This approach largely disregards the problems involved in inverse modelling. By contrast, it has been proposed to only allow for the total aerosol mass concentration to be corrected by data assimilation of AOD (Benedetti et al., 2009; Wang et al., 2014). This is a more prudent approach based on the plausible assumption that a single optical variable only contains enough information to control a single model variable. There have also been intermediate approaches in which the total aerosol masses per size bin have been used as control variables (Saide et al., 2013).
In all such approaches the choice of control variables is based on ad hoc assumptions. Numerical assimilation experiments by Kahnert (2009) suggest that observations of several aerosol optical properties at multiple wavelengths may allow us to constrain more than just the total mass concentration, but certainly not all aerosol parameters. However, it remains an open question how much information a given set of observations actually contains about the size distribution and chemical composition of aerosols, and exactly which model variables are related to the observed signals, and which ones are related to noise. Thus a prerequisite for assimilating remote sensing observations into aerosol transport models is to thoroughly understand the information content of the observations as well as the relation between the model variables and the signal degrees of freedom.
In numerical weather prediction (NWP) modelling, several studies have discussed the information content of satellite observations for meteorological variables. For instance, Joiner and da Silva (1998) applied a singular-value decomposition (SVD) approach in order to reduce the effect of prior information in the analysis, so that the retrieval and forecast errors can be assumed to be uncorrelated. Rabier et al. (2002) considered assimilation of IR sounders, which typically provide a large number of different channels. They applied methods of information and retrieval theory in order to decide which channels contain most information about the vertical variation of temperature and humidity. Cardinali et al. (2004) employed the influence matrix to compute diagnostics of the impact of observations in a global NWP data assimilation system. Johnson et al. (2005a, b) investigated filtering and interpolation aspects in a 4DVAR assimilation system by use of an SVD approach. They also used Tikhonov regularization theory to optimize the signal-to-noise regularization parameter in order to maximize the information that can be extracted from observations. Xu (2006) compared different metrics, namely, the relative entropy and the Shannon-entropy difference, to measure information contents of radar observations assimilated into a coupled atmosphere-ocean model. Bocquet (2009) used methods of information theory to address the question of how to determine an optimum spatial resolution of the discretized space of control variables in geophysical data assimilation. Burton et al. (2016) have recently investigated the information content of "3β + 2α" lidar measurements, i.e. observations of backscattering at three wavelengths and extinction at two wavelengths, where the information content was analysed with regard to the refractive index and number distribution of the aerosol particles. Veselovskii et al. (2004, 2005) have performed similar analyses of the information content of multiwavelength Raman lidar measurements with regard to the complex refractive index and the effective radius of the aerosol particles. As mentioned earlier, the refractive index is a very useful retrieval product of remote sensing observations. However, from the point of view of chemical transport modelling, the main quantities of interest are the concentrations of the different chemical species of which the aerosol particles are composed. Although the chemical composition determines the refractive index, the inversion of this relationship is still under-determined, hence an ill-posed problem. In the present paper, we want to investigate the inverse problem that goes all the way from optical properties to the chemical composition of particles.
The two main goals of this paper are (i) to apply a systematic method for analysing the information content of aerosol optical properties with regard to the particles' chemical composition, and (ii) to test an algorithm for making an automatic choice of control variables in chemical data assimilation such that all control variables are signal related, while the noise-related variables remain unchanged by the assimilation procedure. The main hypothesis is that by constraining the data assimilation algorithm to acting on the signal-related variables only, the output will be less noisy than in an unconstrained assimilation. The focus of our study will be on spectral observations of extinction and backscattering coefficients, which can be retrieved from lidar observations.¹ We will not restrict this analysis to any fixed choice of wavelengths, such as 3β + 2α. Instead, we will investigate the information content for varying combinations of the three main wavelengths of the commonly used neodymium-doped yttrium aluminium garnet (Nd:YAG) laser. However, it should be mentioned that extinction measurements at the lowest harmonic of 1064 nm can be difficult and plagued by high errors; in practice, this will affect the observation error, resulting in a low information content of this particular measurement.
The paper is organized as follows. Section 2 gives a rather concise introduction of the modelling tools and of the numerical approach employed to study the information content of extinction and backscattering observations. Section 3 presents the main results of this study, and Sect. 4 offers concluding remarks. To make this paper self-contained, we include an appendix that gives a brief introduction to some essential concepts of data assimilation, and a detailed explanation of the methods we used for quantifying the information content of aerosol optical observables.

Methods
This study consists of two parts. In the first part we quantify the information content of extinction and backscattering coefficients at multiple wavelengths. In the second part we perform a numerical test to investigate to what extent the concentrations of different chemical aerosol components can be constrained by observations of extinction and backscattering coefficients. The modelling tools required for this study are (i) a chemical transport model, (ii) an aerosol optics model, and (iii) a data assimilation system.
¹ In addition to lidar measurements from ground-based and aircraft-carried instruments (e.g. Burton et al., 2015), there are currently two space-borne lidar instruments in orbit. The CALIOP instrument on board the CALIPSO satellite was launched in April 2006; it has three receiver channels, one at 1064 nm and two at 532 nm to measure orthogonally polarized components. The CATS instrument on board the International Space Station has been operational since January 2015. It measures backscattering at 355, 532, and 1064 nm, where the latter two have two orthogonal polarization channels. It is also capable of performing high-spectral-resolution measurements at 532 nm. A third instrument is planned to be launched in 2018 (ATLID on board EarthCARE).

Multiple scale Atmospheric Transport and CHemistry modelling system (MATCH)
We employ the chemical transport model MATCH, which is an off-line Eulerian CTM with flexible model domain. It has been previously used from regional to hemispheric scales.
Here we use a model version that contains a photochemistry module with 64 chemical species, among them four secondary inorganic aerosols (SIAs), namely ammonium sulfate, ammonium nitrate, other sulfates, and other nitrates. It also contains a module with 16 primary aerosol variables, namely sea salt, elemental carbon (EC), organic carbon (OC), and dust particles, each emitted in four different size bins. Thus, the model contains 20 different aerosol variables. The particle-radius ranges of the four bins are as follows: size bin 1: 10-50 nm; size bin 2: 50-500 nm; size bin 3: 500-1250 nm; size bin 4: 1250-5000 nm.
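The bookkeeping behind the count of 20 aerosol variables can be written out as a minimal sketch. The label format (e.g. OC-3 for organic carbon in size bin 3) and all Python names are assumptions made for illustration; they are not MATCH identifiers.

```python
# Species and bin ranges as listed in the text; names and labels are hypothetical.
SIA = ["ammonium sulfate", "ammonium nitrate", "other sulfates", "other nitrates"]
PRIMARY = ["SEASALT", "EC", "OC", "DUST"]
# Particle-radius range of each size bin in nm.
BIN_RADII_NM = {1: (10, 50), 2: (50, 500), 3: (500, 1250), 4: (1250, 5000)}

def aerosol_variables():
    """List all aerosol variables: 4 SIA species plus 4 primary species in 4 bins."""
    names = list(SIA)
    for species in PRIMARY:
        for size_bin in BIN_RADII_NM:
            names.append(f"{species}-{size_bin}")
    return names

variables = aerosol_variables()
print(len(variables))
```

The secondary inorganic species carry no size-bin index here, which mirrors the statement that only the primary aerosol variables are emitted in four bins.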
The model reads in emission data, meteorological data, and land use data and computes transport processes, chemical transformation, and dry and wet deposition of the various trace gases and aerosols. As output, it provides concentration fields of gases and aerosols, the deposition of these chemical species to land and water-covered areas, as well as the temporal evolution of these variables.
We mention that there exists another model version that includes aerosol microphysical processes, such as nucleation, condensational growth, and coagulation. In that model version the aerosol size distribution evolves dynamically. The model has 20 size bins and seven chemical species (EC, OC, dust, sea salt, particulate sulfate (PSOX), particulate nitrate (PNOX), and particulate ammonium (PNHX)), although not all species are encountered in all size bins. The total number of model variables currently in that version is 82.
More complete information about the mass transport model can be found in Andersson et al. (2007). The sea salt module is discussed in Foltescu et al. (2005). The aerosol microphysics module is described in Andersson et al. (2015).
For the sake of simplicity we here use the mass transport model without aerosol microphysical processes (see next section). The model is set up over Europe covering 33° in the longitudinal and 42° in the latitudinal direction in a rotated lat-long grid with 0.4° resolution. In the vertical direction the model domain extends up to 13 hPa, using 40 terrain-following coordinates. The meteorological input data are taken from the numerical weather prediction model HIRLAM (Undén et al., 2002). For the emissions of all aerosol components we used EMEP data for the year 2007, where EC and OC emissions were computed from total primary particle emissions based on the data in Kupiainen and Klimont (2004, 2007).

Aerosol optics model
We have two different optics models coupled to MATCH: one to the mass transport module, and another to the aerosol microphysics module. The former assumes that all aerosol species are homogeneous spheres, and that each chemical species is contained in separate particles. Under these assumptions the optics model is linear, i.e. the optical properties are linear functions of the concentrations of the chemical aerosol species. The latter model accounts for the fact that in reality different chemical species can be internally mixed, i.e. they can be contained in one and the same particle. That model also accounts for the inhomogeneous internal structure of black carbon mixed with other aerosol components, and for the irregular fractal aggregate morphology of bare black carbon particles (Kahnert et al., 2012a, 2013). Under these assumptions the optics model becomes nonlinear, which introduces additional complications in the inverse-modelling problem. This is the main reason why we chose to use the simpler mass transport optics model in this study. Much of the theory explained in Appendix B relies on the assumption that the optics model is either linear, or that it is only mildly nonlinear, so that it can be linearized (see Eq. B6). Table 1 lists the refractive indices in the mass-transport optics model at the three lidar wavelengths considered in this study. More information about the aerosol optics models implemented in MATCH can be found in Andersson and Kahnert (2016).
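The linearity of the externally mixed optics model can be illustrated with a minimal sketch: each species contributes a fixed coefficient times its mass concentration, so the extinction of a sum of concentration fields equals the sum of the extinctions. The mass extinction coefficients below are made-up numbers chosen for illustration, not values from the MATCH optics model, and the function name is hypothetical.

```python
import math

# Made-up mass extinction coefficients in m^2 g^-1 (illustrative only).
MASS_EXTINCTION = {"SEASALT": 1.0, "EC": 9.0, "OC": 4.0, "DUST": 0.5}

def extinction_coefficient(conc):
    """Return k_ext (m^-1) for externally mixed species.

    conc maps species name -> mass concentration in g m^-3.
    Each species contributes independently, so k_ext is linear in conc.
    """
    return sum(MASS_EXTINCTION[name] * c for name, c in conc.items())

conc_a = {"SEASALT": 2e-6, "EC": 1e-6, "OC": 3e-6, "DUST": 4e-6}
conc_b = {"SEASALT": 1e-6, "EC": 0.0, "OC": 2e-6, "DUST": 8e-6}
conc_sum = {name: conc_a[name] + conc_b[name] for name in conc_a}

k_a = extinction_coefficient(conc_a)
k_b = extinction_coefficient(conc_b)
k_sum = extinction_coefficient(conc_sum)

# Linearity check: extinction of the summed field equals the sum of extinctions.
print(math.isclose(k_sum, k_a + k_b))
```

In such a model the Jacobian of the observation operator is simply the matrix of coefficients and does not depend on the model state, which is what makes the inverse problem easier to analyse than in the internally mixed case.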

Three-dimensional variational data assimilation (3DVAR)
Data assimilation is a class of statistical methods for combining model results and observations. The algorithm weighs these two pieces of information according to their respective error variances and covariances. As output the assimilation returns a result in model space of which the error variances are smaller than those of the original model estimate.
In our case the model variables are the mass mixing ratios of aerosol components in a three-dimensional discretized model domain. These model variables are summarized in a vector $\mathbf{x}$.
The model provides us with a background (or first guess) estimate $\mathbf{x}_b$ (with an error $\boldsymbol{\epsilon}_b$). The observations, summarized in a vector $\mathbf{y}$, are related to the model state $\mathbf{x}$ by
$$\mathbf{y} = \hat{H}(\mathbf{x}) + \boldsymbol{\epsilon}_o, \qquad (1)$$
where $\hat{H}$ is known as the observation operator, and $\boldsymbol{\epsilon}_o$ denotes the vector of observation errors. The problem is to determine the most likely state vector $\mathbf{x}_a$ given $\mathbf{x}_b$ and $\mathbf{y}$, and given the background error covariance matrix $\mathbf{B} = \langle \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}_b^T \rangle$, and the observation error covariance matrix $\mathbf{R} = \langle \boldsymbol{\epsilon}_o \cdot \boldsymbol{\epsilon}_o^T \rangle$. Here $\langle \cdots \rangle$ denotes the expectation value. In the three-dimensional variational method (3DVAR), the maximum-likelihood solution is found by numerically minimizing the cost function
$$J(\mathbf{x}) = \frac{1}{2} (\mathbf{x} - \mathbf{x}_b)^T \cdot \mathbf{B}^{-1} \cdot (\mathbf{x} - \mathbf{x}_b) + \frac{1}{2} \left(\mathbf{y} - \hat{H}(\mathbf{x})\right)^T \cdot \mathbf{R}^{-1} \cdot \left(\mathbf{y} - \hat{H}(\mathbf{x})\right). \qquad (2)$$
Data assimilation is commonly employed for constraining model results by use of observations. However, one can also employ data assimilation as an inverse-modelling tool, i.e. for retrieving a model state from measurements. A summary of the theoretical basis of variational data assimilation is given in Appendices B-D.²
The MATCH model contains a 3DVAR data assimilation module. This model uses a spectral method, i.e. the model state vector is Fourier transformed in the two horizontal coordinates. All error correlations in the horizontal direction are assumed to be homogeneous and isotropic. The background error covariance matrix is modelled with a method that follows similar principles to the NMC method (Parrish and Derber, 1992). A more complete description of our 3DVAR program can be found in Kahnert (2008).
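For a linear observation operator the minimizer of the standard quadratic 3DVAR cost function (background term weighted by the inverse of B, observation term weighted by the inverse of R) has a closed form, which makes a small toy problem easy to check. The following is a minimal sketch under that assumption; it works in physical space with dense matrices and is not the spectral, iteratively minimized implementation in MATCH. All numbers and function names are hypothetical.

```python
import numpy as np

def cost(x, x_b, y, H, B, R):
    """Standard 3DVAR cost: background misfit plus observation misfit."""
    dx = x - x_b
    dy = y - H @ x
    return 0.5 * dx @ np.linalg.solve(B, dx) + 0.5 * dy @ np.linalg.solve(R, dy)

def gradient(x, x_b, y, H, B, R):
    """Gradient of the cost function with respect to x (linear H)."""
    return np.linalg.solve(B, x - x_b) - H.T @ np.linalg.solve(R, y - H @ x)

def analysis(x_b, y, H, B, R):
    """Closed-form minimizer for linear H: x_b + B H^T (H B H^T + R)^-1 (y - H x_b)."""
    innovation = y - H @ x_b
    return x_b + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)

# Toy problem: three model variables, two observations (made-up numbers).
x_b = np.array([1.0, 2.0, 3.0])
H = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 2.0]])
B = np.diag([0.5, 0.5, 0.5])
R = np.diag([0.1, 0.1])
y = np.array([2.5, 9.0])

x_a = analysis(x_b, y, H, B, R)
print(cost(x_b, x_b, y, H, B, R), cost(x_a, x_b, y, H, B, R))
```

At the analysis the gradient vanishes and the cost is no larger than at the background, which is a convenient sanity check for any numerical minimizer applied to the same problem.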

Analysis of the information content of aerosol optical parameters
The questions we ask are these:
1. Suppose we have an $n$-dimensional model space. Given $m$ observations (e.g. $m_1$ different parameters at $m_2$ different wavelengths, so that $m_1 \cdot m_2 = m$), how many independent model variables $N \le n$ can we constrain with the observations? Obviously, the best we can achieve would be $N = \min\{m, n\}$, but often we will have $N < \min\{m, n\}$.
2. Which are the N model variables (or linear combinations of model variables) that can be constrained by the measurements?
² Many authors distinguish between data assimilation and data analysis. In data analysis one merely post-processes model results by incorporating the information provided by observations. In data assimilation, the data analysis process is part of the time integration of the CTM. Thus, in each time step the result of the analysis becomes the new initial state for the next model forecast. Our 3DVAR code can be used in either analysis or assimilation mode. However, in this study we only perform numerical tests at a fixed point in time. Thus we use the 3DVAR code as a data analysis tool.
Here we only give a summary of the most essential theoretical tools for answering these questions. A more thorough explanation of these concepts is given in Appendix C. First we want to explain what we mean by signal degrees of freedom and noise degrees of freedom, closely following an example in Rodgers (2000) (p. 29f). Suppose we have a direct measurement $y$ of a scalar variable $x$ with error $\epsilon_o$, i.e.
$$y = x + \epsilon_o. \qquad (3)$$
Suppose further that we have a background estimate $x_b$ with background error variance $\sigma_b^2$, and that the error $\epsilon_o$ has variance $\sigma_o^2$. The prior variance of $y$ is given by $\sigma_y^2 = \sigma_b^2 + \sigma_o^2$, assuming that background and observation errors are uncorrelated. One can show that the best estimate $x_a$ of $x$ will be
$$x_a = \frac{\sigma_b^2 \, y + \sigma_o^2 \, x_b}{\sigma_b^2 + \sigma_o^2}. \qquad (4)$$
Hence, if $\sigma_b^2 \gg \sigma_o^2$, then the measurement $y$ will provide information for estimating $x_a$, i.e. the measurement provides a degree of freedom for signal. However, if $\sigma_b^2 \ll \sigma_o^2$, then $x_a$ will be close to $x_b$, and $y$ provides little information for estimating $x_a$. The measurement mostly contains information on $\epsilon_o$, i.e. it provides a degree of freedom for noise.
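The scalar example can be made concrete with a few lines of code. This is a minimal sketch, assuming the usual variance-weighted average of background and measurement; the function name and the numbers are hypothetical.

```python
def scalar_analysis(x_b, y, var_b, var_o):
    """Variance-weighted combination of a background value and a direct measurement.

    Returns the analysis x_a and the weight given to the measurement,
    var_b / (var_b + var_o), which lies between 0 (noise) and 1 (signal).
    """
    weight = var_b / (var_b + var_o)
    x_a = x_b + weight * (y - x_b)
    return x_a, weight

# Background far less certain than the measurement: the analysis follows y.
x_a_signal, w_signal = scalar_analysis(x_b=10.0, y=14.0, var_b=9.0, var_o=1.0)

# Measurement far less certain than the background: the analysis stays near x_b.
x_a_noise, w_noise = scalar_analysis(x_b=10.0, y=14.0, var_b=1.0, var_o=9.0)

print(x_a_signal, w_signal)
print(x_a_noise, w_noise)
```

With a variance ratio of 9 the measurement receives a weight of 0.9 in the first case and 0.1 in the second, so the same measured value moves the analysis almost all the way to the measurement, or hardly at all.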
In a more general case we have to consider a state vector $\mathbf{x}$ and a set of measurements $\mathbf{y}$ with errors $\boldsymbol{\epsilon}_o$. The number $N_s$ of signal degrees of freedom is a measure for the information content of the set of measurements. It provides us with an estimate of the number $N$ of model variables that can be controlled by assimilating measurements.
The mapping from model space to observation space given in Eq. (1) can be Taylor expanded to first order according to
$$\hat{H}(\mathbf{x}) \approx \hat{H}(\mathbf{x}_b) + \mathbf{H} \cdot \delta\mathbf{x}, \qquad (5)$$
where $\hat{H}$ is the observation operator, $\mathbf{H}$ denotes its Jacobian, and $\delta\mathbf{x} = \mathbf{x} - \mathbf{x}_b$. The background or prior estimate $\mathbf{x}_b$ is often obtained from a model run. The (in general non-square) matrix $\mathbf{H}$ is the main quantity we need to investigate in order to address the questions formulated at the beginning of this section. It is transformed to the so-called observability matrix $\mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2}$, where $\mathbf{R}$ is the observation error covariance matrix, and $\mathbf{B}$ denotes the error covariance matrix of the background estimate. Subsequently, one performs a singular-value decomposition (SVD)
$$\mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2} = \mathbf{V}_L \cdot \mathbf{W} \cdot \mathbf{V}_R^T, \qquad (6)$$
where the matrices $\mathbf{V}_L$ and $\mathbf{V}_R$ contain the left and right singular vectors, respectively, and $\mathbf{W}$ is a matrix that contains the singular values along the main diagonal, while all other matrix elements are zero. It turns out that the singular values $w_i$ can be employed to compute the number of signal degrees of freedom $N_s$ according to
$$N_s = \sum_i \frac{w_i^2}{1 + w_i^2}. \qquad (7)$$
Another useful measure is obtained by expressing our incomplete knowledge of the atmospheric aerosol state by use of the Shannon entropy. The use of measurement information reduces the entropy, and this entropy reduction $H$ can be expressed in terms of the singular values:
$$H = \frac{1}{2} \sum_i \log_2 \left(1 + w_i^2\right). \qquad (8)$$
Both $N_s$ and $H$ allow us to quantify the information content of a set of measurements. More detailed explanations of these concepts are given in Appendix C. A comprehensive discussion of information aspects and inverse methods for atmospheric sounding can be found in Rodgers (2000).
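The chain from Jacobian to information measures can be sketched numerically: scale the Jacobian by the error covariances, take the singular values, and sum the per-singular-value contributions $w_i^2/(1 + w_i^2)$ and $0.5 \log_2(1 + w_i^2)$. The sketch below assumes that square roots of the covariance matrices are supplied directly (trivial for diagonal matrices); the function name and the toy matrices are hypothetical.

```python
import numpy as np

def information_content(H, B_sqrt, R_inv_sqrt):
    """Singular values of R^(-1/2) H B^(1/2), signal degrees of freedom,
    and entropy reduction in bits."""
    observability = R_inv_sqrt @ H @ B_sqrt
    w = np.linalg.svd(observability, compute_uv=False)
    n_s = float(np.sum(w**2 / (1.0 + w**2)))
    entropy = float(0.5 * np.sum(np.log2(1.0 + w**2)))
    return w, n_s, entropy

# Check case: direct observations, background std three times the observation std.
w_direct, n_s_direct, h_direct = information_content(
    np.eye(2), 3.0 * np.eye(2), np.eye(2))

# Toy indirect case: two observables depending on four model variables.
H_toy = np.array([[2.0, 1.0, 0.5, 0.1],
                  [0.3, 1.5, 0.2, 1.0]])
B_sqrt_toy = np.diag([0.5, 1.0, 0.25, 1.5])   # background error std per variable
R_inv_sqrt_toy = np.diag([1.0 / 0.4, 1.0 / 0.6])  # inverse observation error std
w_toy, n_s_toy, h_toy = information_content(H_toy, B_sqrt_toy, R_inv_sqrt_toy)

print(w_direct, n_s_direct, h_direct)
print(w_toy, n_s_toy, h_toy)
```

Because each contribution to the number of signal degrees of freedom is below 1, the total can never exceed the number of singular values, i.e. the smaller of the number of observations and the number of model variables.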
By performing the transformation we go from our physical model space to an abstract phase space (see Eq. C16 in Appendix C). In this phase space the components of $\delta\mathbf{x}$ can be separated into signal-related and noise-related variables. The signal-related components can be controlled by the measurements, the noise-related components cannot. We therefore introduce constraints into our 3DVAR program such that only the $N_s$ signal-related components of $\delta\mathbf{x}$ are allowed to be adjusted in the data analysis procedure, while the noise-related components are not altered. This is accomplished by adding an extra term $J_G$ to the cost function in Eq. (2), where and where $\mathbf{B}_G$ is a diagonal matrix which we assume to have the form Here $K = \min\{n, m\}$, and the number $c$ is assumed to be much smaller than the smallest singular value. We note that the formulation of the constraint term in Eq. (11) is by no means unique. Other possible choices of the matrix $\mathbf{B}_G$ are discussed in Appendix D3. However, we performed preliminary tests which indicate that the constrained 3DVAR approach is not very sensitive to exactly how one chooses to formulate the matrix $\mathbf{B}_G$, as long as it behaves in such a way that the noise-related phase-space variables are tightly constrained, while the signal-related variables can be varied relatively freely by the analysis. The free parameters $\sigma_G$ and $c$ should be tuned in such a way that the constraints are neither too hard nor too soft. In the former case, the analysis will stay too close to the background estimate. In the latter case, it will not differ much from the unconstrained analysis.
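The idea of adjusting only signal-related components can be illustrated with a deliberately simplified stand-in: a hard truncation in the singular-vector basis of the observability matrix, in which components whose singular value falls below a threshold receive no increment at all. This truncated-SVD analysis is not the soft constraint term $J_G$ described above; it is a sketch under the assumptions of a linear observation operator and a Rodgers-type change of variables, and all names and numbers are hypothetical.

```python
import numpy as np

def svd_analysis(x_b, y, H, B_sqrt, R_inv_sqrt, w_min=0.0):
    """Analysis computed in the singular-vector basis of R^(-1/2) H B^(1/2).

    Components with singular value w_i <= w_min are left at their background
    value (hard truncation). With w_min = 0 this reproduces the ordinary
    unconstrained linear analysis.
    """
    V_L, w, V_R_T = np.linalg.svd(R_inv_sqrt @ H @ B_sqrt, full_matrices=False)
    # Scaled innovation expressed in the basis of left singular vectors.
    y_t = V_L.T @ (R_inv_sqrt @ (y - H @ x_b))
    # Per-component minimizer of the transformed cost; zero for truncated components.
    x_t = np.where(w > w_min, w * y_t / (1.0 + w**2), 0.0)
    # Transform the increment back to physical model space.
    return x_b + B_sqrt @ (V_R_T.T @ x_t)

# Toy problem (made-up numbers): three model variables, two observations.
x_b = np.array([1.0, 2.0, 3.0])
H = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 2.0]])
B_sqrt = np.sqrt(0.5) * np.eye(3)
R_inv_sqrt = np.eye(2) / np.sqrt(0.1)
y = np.array([2.5, 9.0])

x_a_full = svd_analysis(x_b, y, H, B_sqrt, R_inv_sqrt)
x_a_trunc = svd_analysis(x_b, y, H, B_sqrt, R_inv_sqrt, w_min=3.0)
print(x_a_full, x_a_trunc)
```

A soft constraint of the kind described in the text would shrink the noise-related increments rather than set them exactly to zero, so the truncation threshold `w_min` plays a role loosely comparable to that of the tuning parameters of the constraint term.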

Numerical test of the constrained assimilation algorithm
We study the performance of the 3DVAR system by performing a numerical test. To this end, we first perform a reference run by driving the MATCH model with analysed meteorological data. These reference results are taken as the "true" chemical state of the atmosphere. We apply the optics model to the model output to generate synthetic "observations", i.e. a vertical profile at a selected observation point of extinction and backscattering coefficients at three typical lidar wavelengths. Next we run the MATCH model again, this time driven with 48 h forecast meteorological data. The results are taken as a proxy for a background model-estimate that is impaired by uncertainties. Finally, we perform a 3DVAR analysis of the "observations" and the background estimate in an attempt to restore the reference results. In this numerical test we have perfect knowledge of the true state, and we assume that our optics model is nearly perfect, thus providing nearly perfect observations (we assumed that the observation error standard deviation is 10 % of the measurement value).
The only factor that may prevent us from fully restoring the reference state is a lack of information in the observed parameters. Thus, comparison of the retrieval and reference results gives us an indication of how strongly different model variables can be controlled by the information contained in the observations. We perform this test (i) with the unconstrained 3DVAR algorithm and (ii) with the constrained 3DVAR algorithm. We compare both runs in order to make a first assessment of the impact of the constraints. In particular, we are interested in the prospect of reducing the risk of assimilating noise in such a highly under-constrained inverse problem.
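The logic of this twin experiment can be mimicked on a toy scale: take a reference state, generate synthetic observations from it with a linear optics model, degrade the state to obtain a background, and analyse. The sketch below is an assumption-laden illustration with four hypothetical model variables, two observables, made-up coefficients, and an assumed 50 % background error; it uses the closed-form linear analysis rather than the MATCH 3DVAR code.

```python
import numpy as np

def analysis(x_b, y, H, B, R):
    """Closed-form linear analysis: x_b + B H^T (H B H^T + R)^-1 (y - H x_b)."""
    innovation = y - H @ x_b
    return x_b + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)

def weighted_misfit(x, y, H, R):
    """Observation misfit (y - H x)^T R^-1 (y - H x)."""
    d = y - H @ x
    return float(d @ np.linalg.solve(R, d))

# Reference ("true") state: four hypothetical aerosol mixing ratios.
x_true = np.array([1.0, 2.0, 0.5, 3.0])
# Background: the reference state degraded by made-up relative errors.
x_b = x_true * np.array([1.3, 0.7, 1.5, 0.8])
# Hypothetical linear optics model: four components -> two optical observables.
H = np.array([[2.0, 1.0, 0.5, 0.1],
              [0.3, 1.5, 0.2, 1.0]])

# Synthetic observations from the reference state; no noise is added,
# and the observation error std is set to 10 % of the observed value.
y = H @ x_true
R = np.diag((0.1 * y) ** 2)
B = np.diag((0.5 * x_b) ** 2)  # assumed 50 % background error std

x_a = analysis(x_b, y, H, B, R)
print("misfit background:", weighted_misfit(x_b, y, H, R))
print("misfit analysis:  ", weighted_misfit(x_a, y, H, R))
print("component errors: ", x_a - x_true)
```

The analysis necessarily fits the two observables at least as well as the background does, but with only two observations for four unknowns the individual components need not return to their reference values, which is exactly the lack of information that the test is designed to expose.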

Analysis of the information content of aerosol optical parameters
We consider the set of parameters $\{k_\mathrm{ext}(\lambda_1), k_\mathrm{ext}(\lambda_2), \beta_\mathrm{sca}(\lambda_1), \beta_\mathrm{sca}(\lambda_2), \beta_\mathrm{sca}(\lambda_3)\}$, where $k_\mathrm{ext}$ and $\beta_\mathrm{sca}$ denote the extinction and backscattering coefficients, respectively, and the wavelengths $\lambda_1 = 1064$ nm, $\lambda_2 = 532$ nm, and $\lambda_3 = 355$ nm denote the first three Nd:YAG harmonics. Hereafter, we will abbreviate these parameters by Out of this five-parameter set we pick different subsets and analyse the singular values of the corresponding observability matrices. From those we compute the number of signal degrees of freedom as well as the change in Shannon entropy for each subset of measurements. We will focus on those parameter subsets that are technically relevant in practical lidar applications.
Table 2 shows the number of signal degrees of freedom $N_s$ and the reduction in Shannon entropy $H$ for different values of the observation standard deviation $\sigma_o$. For low values of $\sigma_o$, the number of signal degrees of freedom is identical to the number of observational parameters. However, as we increase $\sigma_o$ we observe a decrease in $N_s$. For instance, for $\sigma_o = 100$ % the five parameters The reduction in Shannon entropy $H$ displays an analogous behaviour. For instance, for $\sigma_o = 1$ % we see that $H$ consistently increases as one increases the number of observational parameters. This is much less pronounced for $\sigma_o = 100$ %. In that case, $H$ does increase as one goes from a single parameter to two parameters (compare the first to the second and fourth rows). However, as one adds more parameters, the increase in $H$ slows down considerably. For five parameters (last row), $H$ is only about twice as high as for a single parameter (first row).
This illustrates the pivotal importance of the observation error for the amount of information that can be obtained from measurements. It is important to understand that the observation error $\epsilon_o$ is not the same as the measurement error $\epsilon_m$. Rather, in our case we have $\epsilon_o = \epsilon_m + \epsilon_f$, where $\epsilon_f$ denotes the forward-model error (see, e.g., Eq. 1 and accompanying text in Rabier et al., 2002). Any simplifying assumptions in the optics model or incomplete knowledge of the particle size distribution, morphology, chemical composition, or dielectric properties can contribute to $\epsilon_f$. Such assumptions enter into our relatively simple optics model.³ Note also that in operational applications there may be other terms contributing to $\epsilon_o$. For instance, if a point measurement is taken at a location that does not provide a good representation of the grid-cell average, then one would have to add a representativity error $\epsilon_r$ to the observation error.
The strong impact of the observation errors on the information content of measurements suggests two conclusions.
1. In order to make the forward-model error $\epsilon_f$ as small as possible, it is essential to develop accurate and realistic aerosol optics models. The most accurate measurements may intrinsically contain a wealth of information on aerosol properties. But we can only make use of this information to the extent that our observation operator is able to accurately describe the relation between the physical and chemical particle characteristics and their optical properties.
2. It is equally essential to accurately estimate the contribution of the uncertainties in the aerosol optics model, i.e. to estimate the forward-model error $\epsilon_f$. If we underestimate this error, we will rely on the measurements more than we should, thus assimilating noise. If we overestimate this error, we will waste information contained in the observations. In practice, one way to estimate $\epsilon_f$ is to compute optical properties while varying the particles' size, morphology, and dielectric properties within typical ranges. The resulting variation in the optical properties then allows us to estimate $\epsilon_f$. (For a review of aerosol optics modelling see Kahnert et al., 2014, 2016, and references therein.)
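The error budget discussed in the two points above can be sketched as follows. The sketch assumes that measurement, forward-model, and representativity errors are mutually uncorrelated, so that their variances add; this is an assumption made here for illustration, as is the use of the sample standard deviation of an ensemble of forward-model results as an estimate of the forward-model error. Function names and numbers are hypothetical.

```python
import math
import statistics

def observation_error_std(sigma_m, sigma_f, sigma_r=0.0):
    """Total observation error std, assuming uncorrelated error contributions."""
    return math.sqrt(sigma_m**2 + sigma_f**2 + sigma_r**2)

def forward_model_error_std(optical_values):
    """Spread of an optical property computed for perturbed particle
    size, morphology, or dielectric properties (sample standard deviation)."""
    return statistics.stdev(optical_values)

# Made-up extinction values (arbitrary units) from three perturbed optics runs.
ensemble = [1.0, 1.2, 0.8]
sigma_f = forward_model_error_std(ensemble)

# Combine with a hypothetical measurement error std of 0.15 in the same units.
sigma_o = observation_error_std(sigma_m=0.15, sigma_f=sigma_f)
print(sigma_f, sigma_o)
```

Because the contributions add in quadrature under this assumption, the larger of the two terms dominates: an accurate instrument gains little if the forward-model error remains large.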
In Table 2 we sorted the results for $N_s$ and $H$ by different values of the observation standard deviation. However, it is important to realize that the results also depend on the background error standard deviation, or, more precisely, on how large the background error standard deviations are compared to the observation error standard deviations. Johnson et al. (2005a) made this point very explicit. They discussed an idealized case with diagonal background error covariance matrix $\mathbf{B} = \sigma_b^2 \mathbf{1}$ and observation error covariance matrix $\mathbf{R} = \sigma_o^2 \mathbf{1}$. They considered the case of direct measurements, i.e. the model variables and the observed parameters are the same type of variables. Under such idealized conditions, they showed that one can maximize the amount of information that can be obtained from the observations by optimizing the regularization parameter $\sigma_b/\sigma_o$ (or, equivalently, the regularization parameter $\sigma_o^2/\sigma_b^2$). In our more general case, instead of $\sigma_b$ we need to consider the full matrix $\mathbf{B}^{1/2}$, instead of $\sigma_o^{-1}$ we need to consider $\mathbf{R}^{-1/2}$, and in order to compare the two matrices we need to first transform $\mathbf{B}^{1/2}$ from model to observation space according to $\mathbf{H} \cdot \mathbf{B}^{1/2}$. Thus, in place of $\sigma_b/\sigma_o$ we need to consider the more general quantity $\mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2}$, and we need to diagonalize it by a singular value decomposition according to Eq. (6). Thus the singular values $w_i$ generalize the parameter $\sigma_b/\sigma_o$. The latter applies to the case of direct observations and error covariance matrices that are proportional to unit matrices. The former apply to the general case of non-diagonal error covariance matrices and indirect observations.
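As a short worked check of this correspondence, consider the idealized case of direct measurements, for which the Jacobian is the unit matrix (an assumption that holds only in this special case). The general quantity then collapses to the scalar regularization parameter:

```latex
\mathbf{B} = \sigma_b^2 \mathbf{1}, \quad
\mathbf{R} = \sigma_o^2 \mathbf{1}, \quad
\mathbf{H} = \mathbf{1}
\;\Longrightarrow\;
\mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2}
  = \frac{\sigma_b}{\sigma_o} \, \mathbf{1},
\qquad
w_i = \frac{\sigma_b}{\sigma_o},
\qquad
\frac{w_i^2}{1 + w_i^2} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_o^2} .
```

The last expression is the weight that a direct scalar measurement receives in a variance-weighted analysis, so each singular value contributes close to one signal degree of freedom when the background error dominates and close to zero when the observation error dominates.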
From this we learn that the singular values $w_i$ provide us with a (however abstract) means to quantify how the background standard deviations compare to the observation standard deviations. We pick one of the columns in Table 2, namely the one for $\sigma_o = 50$ %, and expand it in Table 3. We show the singular values $w_i$, as well as their contributions $N_s^i = w_i^2/(1 + w_i^2)$ and $H_i = 0.5 \log_2(1 + w_i^2)$ to the sums in Eqs. (7) and (8), respectively. The results reveal that the singular values $w_i$ can decrease quite rapidly from the largest to the smallest value (see, e.g., case no. 6 in the table). However, the corresponding contribution $N_s^i$ to the number of signal degrees of freedom changes rather smoothly. Even those singular values that are only slightly larger than 1 make contributions $N_s^i$ that lie close to 1 (see, e.g., $i = 4$ in case no. 6). However, once $w_i$ falls below 1, the corresponding contribution $N_s^i$ becomes much smaller than 1 (see $i = 5$ in case no. 6).
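The different behaviour of the two contributions can be tabulated with a few lines of code. This is a minimal sketch that evaluates the two expressions quoted above for a handful of arbitrarily chosen singular values; the values are illustrative and are not taken from Table 3.

```python
import math

def contributions(w):
    """Contribution of one singular value w to N_s and to H (in bits)."""
    n_i = w**2 / (1.0 + w**2)
    h_i = 0.5 * math.log2(1.0 + w**2)
    return n_i, h_i

# Arbitrary singular values spanning noise-dominated to signal-dominated cases.
table = {w: contributions(w) for w in (0.3, 1.0, 3.0, 10.0, 100.0)}
for w, (n_i, h_i) in table.items():
    print(f"w = {w:6.1f}   N_s^i = {n_i:.3f}   H_i = {h_i:.3f}")
```

The contribution to the signal degrees of freedom saturates at 1 for large singular values, whereas the entropy contribution keeps growing logarithmically; a singular value of exactly 1 contributes 0.5 to both.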
Let us now compare the different subsets of parameters in Tables 2 and 3. In case no. 1 we observe a single parameter that provides a single degree of freedom. In cases no. 2 and 4 we observe two parameters, which nearly doubles $N_s$. Comparison of these two cases shows that it does not make a significant difference whether we observe backscattering coefficients at different wavelengths, or both extinction and backscattering coefficients each at a single wavelength. In either case the measurements provide roughly the same amount of information (in terms of $N_s$ or $H$). The same is true when considering three observational parameters (compare case nos. 3 and 5). The 3β + 2α case (no. 6) clearly provides the largest amount of information in comparison to the other cases. However, as we saw in Table 2, observation errors that are large in comparison to the background errors can significantly reduce the effective information that can be assimilated into a model.

Numerical inverse-modelling test
We integrated the findings of Sect. 3.1 into our 3DVAR program by constraining the algorithm to vary only the signal-related model variables. To illustrate the method we conduct a numerical test as described in Sect. 2.5. We perform a 3DVAR analysis by assimilating "3β + 2α" profiles, i.e. synthetic lidar measurements of $\beta_{\mathrm{sca}}$ at the three wavelengths 1064, 532, and 355 nm together with $k_{\mathrm{ext}}$ at the two wavelengths 532 and 355 nm. Thus in our case the number of singular values in each vertical layer is $K = 5$. We assume an idealized situation in which the observation standard deviation is only 10 %. As we see in Table 2 (case no. 6), the number of signal degrees of freedom is $N_s = 4.9$ in this case. Thus, we roughly have as many signal degrees of freedom as we have measurements.
Figure 1 shows vertical profiles of selected aerosol components, namely (from top to bottom): organic carbon (OC) in the third size bin (OC-3), OC in the fourth size bin (OC-4), elemental carbon (EC) in the third size bin (EC-3), and mineral dust in the first size bin (DUST-1). The reference and background mixing ratios are shown in black and green, respectively. The 3DVAR analysis was first performed without any constraints; the results are shown in the left column by the blue line. Then the 3DVAR analysis was repeated with the constraints in Eqs. (10) and (11); the results are represented in the right column by the red line. Clearly, the unconstrained analysis (blue lines in the left panels) yields results that oscillate quite erratically in the vertical direction. Also, the unconstrained analysis can yield conspicuously high values at higher altitudes, even though the reference and background values are both close to zero. By contrast, the constrained analysis (red lines in the right panels) yields results that better agree with the reference results. The noisiness in the vertical direction is significantly reduced, and the results at higher altitudes are generally lower than those obtained with the unconstrained analysis.

Figure 1. Vertical profiles of selected aerosol components in different size bins. From top to bottom: organic carbon in the third size bin (OC-3), OC in the fourth size bin (OC-4), elemental carbon in the third size bin (EC-3), and dust in the first size bin (DUST-1). The reference results are shown in black, and the background (first guess) estimate is shown in green. The unconstrained 3DVAR analysis results are presented in the left panels in blue, the constrained 3DVAR analysis results are shown in the right panels in red.
Figure 2 shows analogous results for the mass mixing ratios of different aerosol components, each summed over all size bins. The aerosol components are (from top to bottom): elemental carbon (EC), organic carbon (OC), mineral dust (DUST), sea salt (NaCl), secondary inorganic aerosols (SIA, i.e. the sum over all sulfate, nitrate, and ammonium species), and PM$_{10}$ (i.e. the sum over all aerosol components). Clearly, the constrained analysis faithfully retrieves both PM$_{10}$ and SIA. The unconstrained analysis performs almost equally well for these two variables. Sea salt and mineral dust are not well retrieved from the measurements in either the constrained or unconstrained approach. EC and OC are very well retrieved by the constrained analysis. For these components, the unconstrained analysis has a very small bias compared to the reference results, but it is considerably more noisy (i.e. oscillating in the vertical direction) than the constrained analysis. We also see, again, that the mixing ratios at higher altitudes obtained with the unconstrained analysis can be unreasonably high. This is especially pronounced for OC. In general, however, the problems we encounter in the unconstrained analysis are less pronounced in Fig. 2 than in Fig. 1. A possible explanation is that SIA may be most strongly related to the measurement signal, and SIA is dominating the aerosol mass in this case. We will return to this point shortly. Another possible factor is that the noise in the analysis can be damped by summing up results over several size bins.

Figure 2. As Fig. 1, but for the total mass mixing ratio (summed over all size bins). The components are (from top to bottom): EC, OC, mineral dust, sea salt, secondary inorganic aerosols (sum of all sulfate, nitrate, and ammonium species), and PM$_{10}$ (sum of all aerosol components).

Figure 3 shows the observations (black) as well as the observation-equivalents of the background estimate (green) and the unconstrained (blue) and constrained (red) 3DVAR analysis for all five observations. We learn from this figure that the analysis follows the observations faithfully. The reason for this is that we assumed that the observations were highly accurate with an error standard deviation of only 10 %. In fact, the observation-equivalents of the analysis deviate from the observations by even less than 10 %. However, our tests confirmed that an increase in the observation error eventually yields analysis results whose observation-equivalents increasingly deviate from the observations (not shown).
We have seen that the analysis provides a reasonable, but, as expected, not a perfect answer to the inverse problem. We have further seen that at the observation site it relies more on the observations than on the background estimate. Most importantly, we have seen that the constraints introduced in the 3DVAR algorithm suppress noise in the analysis, especially in EC and OC. However, the previous figures do not provide us with any direct insight into how exactly the constraints accomplish this. To learn more about that we need to inspect the analysis in the abstract phase space of the transformed model variables $\delta x'$. (Recall that we defined this variable in Eq. 9 as $\delta x' = V_R^T \cdot B^{-1/2} \cdot \delta x$.)

Figure 4 shows vertical profiles of a selection of the, in total, 20 variables $\delta x'_i$. The background estimate corresponds to $\delta x'_i = 0$ and is represented by the green line. The unconstrained 3DVAR analysis increment is represented by the blue line, the constrained 3DVAR analysis increment is shown by the red line. The first five phase space elements in the top row are the signal-related control variables. Generally, the magnitude of the constrained increments (red) is larger than that of the unconstrained increments (blue). The noise-related phase space elements, five of which are shown in the bottom row, display the opposite behaviour. The constrained increments are close to zero, as they should be. The unconstrained elements consistently show higher magnitudes than the constrained elements. However, we also see that the unconstrained analysis does produce increments that are largest for the two elements $\delta x'_1$ and $\delta x'_2$, which most strongly relate to the measurement signal. Based on our single test case we cannot say if this is a lucky coincidence or a consistent property. If the latter, it may indicate that we are using rather reasonable background error statistics, so that the analysis increment in observation space is distributed to the different variables in model space in a sensible way. If the former, it could be the case that the success of the unconstrained analysis is largely dependent on whether or not those aerosol components dominate the total aerosol mass that most strongly relate to the signal degrees of freedom. (In our case the total mass is dominated by SIA, which is very well retrieved by the analysis.)
Finally, we want to obtain a better understanding of how the aerosol components $x$ in model space, or their increments $\delta x$, are linked with the signal-related phase-space elements $\delta x'$. To this end we inspect the first five row vectors of the transformation matrix $V_R^T \cdot B^{-1/2}$ in Eq. (9). The magnitude of these elements can be taken as a measure for how much each aerosol component of $\delta x$ in model space contributes to the signal-related elements of $\delta x'$. Figure 5 shows the elements $(V_R^T \cdot B^{-1/2})_{ij}$ for $i = 1, \ldots, 5$, and for $j = 1, \ldots, 20$, where 5 is the number of signal-related phase-space elements, and 20 is the number of aerosol components in model space. Results are shown for model layers 2 (left column) and 22 (right column), which correspond to altitudes of about 100 m and 6 km, respectively. The x axis shows sea salt (NaCl), EC, OC, and dust, each in four size bins, as well as the four SIA components, i.e. sulfates (SO$_x$) other than (NH$_4$)$_2$SO$_4$, ammonium sulfate (AS), ammonium nitrate (AN), and nitrates (NO$_x$) other than NH$_4$NO$_3$.
Comparison of the two columns clearly demonstrates that the elements of the transformation matrix can vary considerably with vertical layer (or, more generally, with location). This is because the error covariance matrix $B$ varies with location, and the matrix $R$ varies from one observation site to another (in our case, from one altitude to another). Hence the matrix $V_R$ is also dependent on location (see Eq. 6). Consequently, it is very difficult to draw general conclusions about which aerosol components make a dominant contribution to the signal-related phase-space variables; this can vary with location, and it can vary for different data sets.
However, in our case the SIA components consistently make a strong contribution to the first signal-related element $\delta x'_1$. Since SIA is dominating the aerosol mass mixing ratio in this test case, the analysis was able to retrieve PM$_{10}$. We also see that the dust components make only a weak contribution to most of the signal-related elements $\delta x'_i$, especially to the first one. This is a likely explanation for the difficulties encountered in retrieving the dust mass mixing ratio. Sea salt is more complicated. Size bins 3 and 4 do contribute considerably to $\delta x'_1$, and also to some of the other four increments, while size bins 1 and 2 do not make a significant contribution to most of the five signal-related control variables. In our case the sea salt mass is strongly dominated by the second size bin (not shown). This explains the difficulties we encountered in the retrieval of sea salt.

Summary and conclusions
We have quantified the information content of multiwavelength lidar measurements with regard to the chemical composition of aerosol particles. Different combinations of extinction and backscattering observations at several wavelengths have been investigated by determining the singular values of the scaled observation operator, by computing the number of signal degrees of freedom $N_s$, and by calculating the reduction in Shannon entropy $H$ caused by taking measurements. We first quantified $N_s$ and $H$ as a function of observation standard deviation $\sigma_o$. The information content of the observations, as expressed by $N_s$ and $H$, decreased as $\sigma_o$ was increased. This decrease became more pronounced as the number of simultaneously observed parameters increased.
The observation error depends not only on the measurement error, but also on the forward-model error. The latter depends on the uncertainties in the aerosol optics model. This highlights the importance of developing accurate aerosol optics models and of obtaining an accurate estimate of the observation error, especially of the uncertainty in the aerosol optics model. This is a prerequisite for extracting as much information as possible from the measurements, while avoiding the extraction of noise rather than signal. More often than not, computational limitations and lack of knowledge force us to introduce simplifying assumptions about the particles' morphologies. However, we know that aerosol optical properties can be highly sensitive to the shape (Mishchenko et al., 1997; Kahnert, 2004), small-scale surface roughness (Kahnert et al., 2012b), inhomogeneity (Mishchenko et al., 2014; Kahnert, 2015), aggregation (Fuller and Mackowski, 2000; Liu and Mishchenko, 2007; Kahnert and Devasthale, 2011), irregularity (Muinonen, 2000; Bi et al., 2010), porosity (Vilaplana et al., 2006; Lindqvist et al., 2011; Kylling et al., 2014), and combinations thereof (Lindqvist et al., 2009, 2014; Kahnert et al., 2013). We need to know how much these sources of uncertainty contribute to the observation standard deviation. One way of estimating this is to compare aerosol optical properties computed with simple shape models to either measurements or to computations based on more realistic particle shape models; see Kahnert et al. (2014) for a recent review and a more detailed discussion.
The singular values of the scaled observation operator provide us with an abstract measure to compare the standard deviations of the background (prior) estimate to those of the observations. This measure is rather abstract because background and observation errors are, in general, in different spaces and cannot be directly compared. However, we constructed a mapping that transforms the state vector in physical (model) space to an abstract phase space in which the components of the state vector can be partitioned into signal-related and noise-related components. The singular values indicate to what extent the signal-related phase-space variables can be constrained by the measurements. We exploited this fact by constructing weak constraints in a 3DVAR data assimilation code, which limited the assimilation algorithm to acting on the signal-related phase-space variables only (hereafter referred to as the constrained analysis). The idea was to maximize the use of information, while avoiding the risk of assimilating noise by overusing the measurements. Thus, our main hypothesis was that the constrained analysis will yield less noisy results than the unconstrained analysis. Numerical tests confirmed this hypothesis. Notably in the case of elemental carbon (EC) and organic carbon (OC) the unconstrained analysis gave mixing ratios that oscillated considerably in the vertical direction. The constrained analysis results were considerably less noisy.
When mapped into observation space, the analysis result closely reproduced the measurements. When viewed in the abstract phase space, we found that the constrained analysis did, indeed, yield noise-related components that were close to zero, as they should be. This was not so in the unconstrained analysis. Also, the magnitude of the signal-related phase-space components was generally larger in the constrained analysis than in the unconstrained analysis. This confirms that the constraints we introduced work as intended.
In our specific test case secondary inorganic aerosol components were most faithfully retrieved by the inverse modelling solution, followed by organic and black carbon. Dust and sea salt mass mixing ratios were more challenging to retrieve. We could explain this by inspecting the linear coefficients in the transformation from physical space to the abstract phase space. We found that those aerosol components that had the largest weight in the transformation were most faithfully retrieved by the analysis. However, these linear coefficients depend on the background error covariances (which can change with location), and on the observation error variances. Therefore, it is difficult to draw general conclusions about which aerosol components are most easily retrieved by a given set of measurements.
The results presented here suggest further questions for future studies. We have performed this investigation with a mass transport model, thus focusing on the information content of optical measurements with respect to the chemical composition of aerosols. When we include aerosol microphysical processes, then the model delivers the aerosols' size distribution, as well as their size-resolved chemical composition. This makes the problem quite different from the one we investigated here. First, the dimension of the model space is considerably larger for an aerosol microphysics transport model. Constraining such a model with limited information from measurements becomes even more challenging than in the case of a mass transport model. On the other hand, an aerosol microphysics model delivers information on the particles' size distribution and mixing state. Therefore, this would require us to make fewer assumptions in the aerosol optics model, which may reduce the observation error. The present study could be extended to investigate the information contained in extinction and backscattering measurements for simultaneously constraining the chemical composition and the size of aerosol particles.
Another important issue concerns the choice of the aerosol optics model. In the present study we employed a simple homogeneous-sphere model in which all chemical components were assumed to be externally mixed. There is little one can put forward in defence of this model other than pure convenience. (Regarding the applicability of simplified model particles in atmospheric optics see the review by Kahnert et al., 2014.) As a result of the external-mixture assumption, the observation operator is linear, which is a prerequisite for much of the theoretical foundations of this study; see Appendices B-D for details. However, it has been demonstrated that drastically simplifying assumptions, such as the external-mixture approximation, can give model results for aerosol optical properties that differ substantially from those obtained with more realistic nonlinear optics models (Andersson and Kahnert, 2016). It would therefore be important to extend the present study to include more accurate and realistic optics models. A first step could be to analyse the degree of nonlinearity of optics models that account for internal mixing of different aerosol species. If they turn out to be only mildly nonlinear, then one can linearize them and work with the Jacobian of the nonlinear observation operator. Otherwise the theoretical methods employed in this paper would have to be extended in order to accommodate nonlinear observation operators.
Data availability. The data used in this study are included in the Supplement.

[...] background estimate is more commonly referred to as the a priori estimate.

$^{5}$ The optics model $\hat{H}$ usually has to invoke assumptions about physical aerosol properties that are relevant for the optical properties, but not provided by the CTM output, e.g. assumptions about the morphology of the particles. If the CTM is a simple mass-transport model without aerosol microphysics, then it is also necessary to invoke assumptions about the size distribution of the aerosols.

$^{6}$ We stress, once more, that the observation error must not be confused with the measurement error $\epsilon_m$. The latter contributes to the former, but the observation error also contains other sources of error. For instance, if we deal with morphologically complex particles, but our lack of knowledge forces us to make assumptions and invoke approximations about the particle shapes, then this forward-model error $\epsilon_f$ contributes to the observation error. The same is the case if we lack information about the particles' size distribution. In operational applications the representativity error $\epsilon_r$ can also make a substantial contribution to $\epsilon_o$.

[...] system is found in state $x$, given measurements $y$ with error covariances $R$.$^{7}$ Equations (B1)-(B3) can be summarized in the form

$$P(x|y) \propto \exp[-J(x)], \quad \text{(B4)}$$

$$J(x) = \frac{1}{2} (x - x_b)^T \cdot B^{-1} \cdot (x - x_b) + \frac{1}{2} (\hat{H}(x) - y)^T \cdot R^{-1} \cdot (\hat{H}(x) - y), \quad \text{(B5)}$$

where $J$ is suggestively called the cost function, since it can be interpreted as a measure for how "costly" it is for a state $x$ to simultaneously deviate from the background estimate and the measurements within the permitted error bounds. The deviations are weighted with the inverse error covariance matrices. For instance, this means that for measurements with a small error variance, a deviation $\hat{H}(x) - y$ becomes "more costly". We are interested in the most probable aerosol state of the atmosphere, i.e. in that state $x_a$ for which the probability distribution attains its maximum. This is obviously the case when the argument of the exponential in Eq. (B4) assumes a minimum. Thus we seek to minimize the cost function $J$. The variational method is based on computing the gradient of the cost function, $\nabla_x J$, and using it in a descent algorithm to iteratively search for the minimum of $J$.
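In practice it is common to introduce the variable $\delta x = x - x_b$, and use the first-order Taylor expansion of the observation operator,

$$\hat{H}(x) \approx \hat{H}(x_b) + H \cdot \delta x, \quad \text{(B6)}$$

where the $(m \times n)$ matrix $H$ denotes the Jacobian of $\hat{H}$ at $x = x_b$. If $\hat{H}$ is only mildly nonlinear, and if the components of $\delta x$ are sufficiently small, then we can substitute this first-order approximation into Eq. (B5), which yields

$$J = J_b + J_o, \quad \text{(B7)}$$

$$J_b = \frac{1}{2} \delta x^T \cdot B^{-1} \cdot \delta x, \quad \text{(B8)}$$

$$J_o = \frac{1}{2} (H \cdot \delta x - \delta y)^T \cdot R^{-1} \cdot (H \cdot \delta x - \delta y), \quad \text{(B9)}$$

with $\delta y = y - \hat{H}(x_b)$. The components of the vector $\delta x$ are the control variables that are iteratively varied by the algorithm until the minimum of the cost function is found.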
The solution to the equation $\nabla_x J = 0_n$ is a solution to the inverse problem (where $0_n$ denotes the null vector in $n$-dimensional model space); we input the observations $y$ into the algorithm, and as output we obtain a result in model space that is consistent with the measurements (within the given error bounds).$^{8}$ What if the measurements contain insufficient information about the state $x$? The algorithm will still provide an answer to the inverse problem, but the missing information will be supplemented by the background estimate $x_b$. The weighting of the two pieces of information, $x_b$ and $y$, is controlled by the respective error covariance matrices. Thus data assimilation is a statistical approach, which can be expected to give good results on average, but not in every single time step of the model run. This can become highly problematic if we only have very few observations, i.e. $m \ll n$, where $n$ is the dimension of the model space. If we allow all model variables to be freely adjusted by the assimilation algorithm in such a severely under-constrained case, then the algorithm may just assimilate noise from the measurements rather than signal, resulting in unreasonable solutions to the inverse problem (e.g. Kahnert, 2009). To avoid such problems, one needs to systematically analyse the information content of the observations and constrain the assimilation algorithm to only operate on the signal degrees of freedom.
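The under-constrained situation $m \ll n$ can be illustrated with a small numerical experiment. The sketch below assumes the standard incremental cost function $J = \frac{1}{2} \delta x^T \cdot B^{-1} \cdot \delta x + \frac{1}{2} (H \cdot \delta x - \delta y)^T \cdot R^{-1} \cdot (H \cdot \delta x - \delta y)$ with a linear observation operator; all matrices are random placeholders and the dimensions are hypothetical. For such a quadratic $J$ the condition $\nabla J = 0_n$ is a linear system, which we solve directly instead of running a descent algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2                       # hypothetical sizes: few observations, m < n

# Random placeholder statistics and observation operator.
B = np.diag(rng.uniform(0.5, 2.0, n))   # background error covariances
R = 0.1 * np.eye(m)                     # observation error covariances
H = rng.normal(size=(m, n))             # linear(ized) observation operator
dy = rng.normal(size=m)                 # innovation y - H(x_b)

B_inv = np.linalg.inv(B)
R_inv = np.linalg.inv(R)

def cost(dx):
    """Background term plus observation term of the cost function."""
    r = H @ dx - dy
    return 0.5 * dx @ B_inv @ dx + 0.5 * r @ R_inv @ r

def grad(dx):
    """Gradient of the cost function with respect to dx."""
    return B_inv @ dx + H.T @ R_inv @ (H @ dx - dy)

# Setting the gradient to zero gives a linear system for the increment.
dx_a = np.linalg.solve(B_inv + H.T @ R_inv @ H, H.T @ R_inv @ dy)

print("analysis increment:", dx_a)
print("gradient norm at solution:", np.linalg.norm(grad(dx_a)))
```

Note that the solver returns an increment for all $n$ components even though only $m$ numbers were observed; the remaining freedom is fixed entirely by $B$, which is the situation discussed above.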

Appendix C: Information content of measurements
Our ultimate goal is to formulate the data assimilation problem in such a way that the information contained in the measurements is fully exploited, but not overused. To this end, we first need to know how many independent quantities can be determined from a specific set of measurements. We investigate this question by borrowing ideas from retrieval and information theory; see Rodgers (2000) for more detailed explanations.
The main idea is to compare the variances of the model variables to those of the observations. Only those model variables whose variance is larger than that of the observations can be constrained by measurements. However, actually making such a comparison poses two problems. The first problem is that one cannot readily compare error covariance matrices. The second problem is that model variables and measurements are in different spaces. We first address the second problem.
When we account for observation errors $\epsilon_o$, then the basic relation between model variables and observations is, to first order,

$$\delta y = H \cdot \delta x + \epsilon_o. \quad \text{(C1)}$$

The error covariance matrices are given by the expectation values $B = \langle \delta x \cdot \delta x^T \rangle$ and $R = \langle \epsilon_o \cdot \epsilon_o^T \rangle$, where the dot denotes a dyadic product.$^{9}$ From Eq. (C1) we see that the covariance matrix of $\delta y = y - \hat{H}(x_b)$ is given by $\langle \delta y \cdot \delta y^T \rangle = H \cdot B \cdot H^T + R$, where we assumed that background and observation errors are uncorrelated. This last equation suggests that we can compare model and observation errors in the same space by transforming the background error covariance matrix from the space of $(n \times n)$ matrices to the space of $(m \times m)$ matrices, namely $H \cdot B \cdot H^T$.
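To address the first problem, we diagonalize the covariance matrices by making the following change of variables:

$$\delta\tilde{x} = B^{-1/2} \cdot \delta x, \quad \delta\tilde{y} = R^{-1/2} \cdot \delta y, \quad \tilde{H} = R^{-1/2} \cdot H \cdot B^{1/2}. \quad \text{(C2)-(C4)}$$

Here $B^{1/2}$ denotes the positive square root$^{10}$ of the matrix $B$, and $B^{-1/2}$ denotes its inverse. The scaled observation operator $\tilde{H}$ is sometimes referred to as the observability matrix. In the new basis, the cost function in Eqs. (B7)-(B9) becomes

$$J = \frac{1}{2} \delta\tilde{x}^T \cdot \delta\tilde{x} + \frac{1}{2} (\tilde{H} \cdot \delta\tilde{x} - \delta\tilde{y})^T \cdot (\tilde{H} \cdot \delta\tilde{x} - \delta\tilde{y}). \quad \text{(C5)}$$

The covariance matrices are now unit matrices. This can also be seen by considering the transformed errors, e.g. $\langle \tilde{\epsilon}_o \cdot \tilde{\epsilon}_o^T \rangle = R^{-1/2} \cdot \langle \epsilon_o \cdot \epsilon_o^T \rangle \cdot R^{-1/2} = \mathbf{1}_{m \times m}$. (Here, $\mathbf{1}_{m \times m}$ denotes the unit matrix in $m$-dimensional observation space.) Similarly, we find $\langle \delta\tilde{x} \cdot \delta\tilde{x}^T \rangle = \mathbf{1}_{n \times n}$. The covariance matrix of the transformed measurement vector $\delta\tilde{y}$ is given by $\langle \delta\tilde{y} \cdot \delta\tilde{y}^T \rangle = \tilde{H} \cdot \tilde{H}^T + \mathbf{1}_{m \times m}$. The first term is the model error covariance term transformed into observation space, while the second term (the unit matrix) is the diagonalized observation error covariance matrix.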
We are still not in a position to make a meaningful comparison of model and observation errors, since the first term, $\tilde{H} \cdot \tilde{H}^T$, is still not diagonal. To make it so we need to perform one more transformation. To this end, we consider the singular value decomposition of the matrix $\tilde{H}$:

$$\tilde{H} = V_L \cdot W \cdot V_R^T. \quad \text{(C6)}$$

Here $\tilde{H}$ is a $(m \times n)$ matrix, the matrix of the left-singular vectors $V_L$ is a $(m \times m)$ matrix, the matrix $V_R$ containing the right-singular vectors is a $(n \times n)$ matrix, and the $(m \times n)$ matrix $W$ consists of two blocks. If $m < n$, then the left block of $W$ is a $(m \times m)$-diagonal matrix containing the $m$ singular values $w_1, \ldots, w_m$ on the diagonal; the right block is a $(m \times (n - m))$-null matrix. Similarly, if $m > n$, then the upper block of $W$ is a $(n \times n)$-diagonal matrix containing the $n$ singular values on the diagonal, while the lower block is a $((m - n) \times n)$-null matrix. We now make another change of variables:

$$\delta x' = V_R^T \cdot \delta\tilde{x}, \quad \delta y' = V_L^T \cdot \delta\tilde{y}, \quad H' = V_L^T \cdot \tilde{H} \cdot V_R. \quad \text{(C7)-(C9)}$$

The matrices $V_L$ and $V_R$ are orthogonal, i.e. $V_L^T \cdot V_L = \mathbf{1}_{m \times m}$, and similarly for $V_R$. Thus, substitution of Eqs. (C7)-(C9) into (C5) yields

$$J = \frac{1}{2} \delta x'^T \cdot \delta x' + \frac{1}{2} (H' \cdot \delta x' - \delta y')^T \cdot (H' \cdot \delta x' - \delta y'). \quad \text{(C10)}$$

Evidently, the transformation given in Eqs. (C7)-(C9) preserves the diagonality of the background and observation error covariance matrices. What about the covariance matrix $\langle \delta y' \cdot \delta y'^T \rangle$ in the new basis?
Using $\epsilon'_o = V_L^T \cdot \tilde{\epsilon}_o$, as well as Eqs. (C1), (C2)-(C4), and (C6)-(C9), we obtain $\langle \delta y' \cdot \delta y'^T \rangle = H' \cdot H'^T + \mathbf{1}_{m \times m}$. The contribution of the background error covariances in this coordinate system is $H' \cdot H'^T$, which is a diagonal matrix. This becomes clear from Eqs. (C6) and (C9), which yield

$$H' \cdot H'^T = W \cdot W^T, \quad \text{(C11)}$$

which is a $(m \times m)$ diagonal matrix. Thus in this coordinate system we can readily compare the diagonal elements of the transformed background error covariance matrix $H' \cdot H'^T$ to the diagonal (unit) elements of the observation error covariance matrix $\mathbf{1}_{m \times m}$. Roughly, those singular values $w_i$ on the diagonal of $W$ that are larger than unity correspond to model variables $\delta x'_i$ that can be controlled by the measurements. Those singular values smaller than unity correspond to model variables that are only related to noise.
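The chain of transformations described above can be verified numerically. The following sketch uses random placeholder matrices and hypothetical dimensions; the helper `sym_sqrt` is our own illustration of one way to obtain the positive square root. It forms the observability matrix $R^{-1/2} \cdot H \cdot B^{1/2}$, computes its singular value decomposition, and checks that the background contribution to the covariance of the transformed observations is diagonal with entries $w_i^2$.

```python
import numpy as np

def sym_sqrt(A):
    """Positive square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(vals)) @ vecs.T

rng = np.random.default_rng(1)
n, m = 5, 3                                   # hypothetical sizes, m < n

A = rng.normal(size=(n, n))
B = A @ A.T + n * np.eye(n)                   # non-diagonal, positive definite
R = np.diag(rng.uniform(0.05, 0.5, m))        # diagonal observation errors
H = rng.normal(size=(m, n))                   # placeholder Jacobian

# Observability matrix: scaled observation operator.
H_tilde = np.linalg.inv(sym_sqrt(R)) @ H @ sym_sqrt(B)

# SVD: H_tilde = V_L W V_R^T, singular values sorted in descending order.
V_L, w, V_R_T = np.linalg.svd(H_tilde, full_matrices=True)
W = np.zeros((m, n))
W[:m, :m] = np.diag(w)                        # left block diagonal, right block null

# Rotated operator and covariance of the rotated observation increments.
H_prime = V_L.T @ H_tilde @ V_R_T.T
cov_dy_prime = H_prime @ H_prime.T + np.eye(m)

print("singular values:", w)
print("diagonal of covariance:", np.diag(cov_dy_prime))
```

In this basis each diagonal entry is $w_i^2 + 1$, so the background variance $w_i^2$ and the unit observation variance can be compared element by element.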
In the above discussion we relied on plausibility arguments. We mention that there are more systematic ways of approaching the problem. Here we merely state some key results without going into details. The interested reader is referred to chapter 2 in Rodgers (2000). However, in all approaches the main quantities of interest are always the singular values of the observability matrix $R^{-1/2} \cdot H \cdot B^{1/2}$. One can compute the number of signal degrees of freedom $N_s$ from the expectation value of $J_b$ in Eq. (B8). The result can be expressed in terms of the singular values $w_i$ of the observability matrix:

$$N_s = \sum_{i=1}^{\min(n,m)} \frac{w_i^2}{1 + w_i^2}, \quad \text{(C12)}$$

where $n$ is the dimension of model space, and $m$ is the dimension of observation space.
Another approach is based on information theory. Given a system described by a probability distribution function $P(x)$, one defines the Shannon entropy

$$S(P) = -\int P(x) \log_2 \frac{P(x)}{P_0(x)} \, \mathrm{d}x, \quad \text{(C13)}$$

where $P_0$ is a normalization factor needed to make the argument of the logarithm dimensionless. A decrease in entropy expresses an increase in our knowledge of the system. For instance, if we initially describe the system by $P_i(x)$, and, after taking measurements, by $P_f(x)$, then the measurement process has changed the entropy by an amount

$$H = S(P_i) - S(P_f). \quad \text{(C14)}$$

In our case, we assume that all errors are normally distributed. In that case, one can show that $H = \frac{1}{2} \sum_i \log_2 (1 + w_i^2)$. $H$ can be interpreted as a measure for the information content of a set of measurements.
Our findings so far suggest a general strategy for how to optimize the amount of information that can be extracted from measurements. First, we need to compute the singular value decomposition in Eq. (C6), as well as the transformation given in Eqs. (C2) and (C7), which we can summarize as

$$\delta x' = V_R^T \cdot B^{-1/2} \cdot \delta x.$$

Then we want to formulate the minimization of the cost function in such a way that only those components of $\delta x'$ are adjusted by the assimilation algorithm that correspond to the largest singular values of the matrix $W$ in Eq. (C6). All other elements of $\delta x'$ should be left alone. In other words, we want to constrain the minimization of the cost function to the subspace of the signal degrees of freedom of the state vector. Thus, in order to implement this idea, we first need to discuss how to incorporate constraints into the theory.
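The partition into signal-related and noise-related components can be sketched as follows. This is only an illustration of the transformation $\delta x' = V_R^T \cdot B^{-1/2} \cdot \delta x$ with random placeholder matrices; it applies a hard projection (setting the noise-related components to zero) with the rough threshold $w_i > 1$, which is a simplification and not the weak-constraint formulation used in our 3DVAR code. The names `T`, `T_inv`, and `n_signal` are hypothetical.

```python
import numpy as np

def sym_sqrt(A):
    """Positive square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(vals)) @ vecs.T

rng = np.random.default_rng(3)
n, m = 6, 3                                  # hypothetical sizes

A = rng.normal(size=(n, n))
B = A @ A.T / n + 0.1 * np.eye(n)            # placeholder background covariances
R = 0.5 * np.eye(m)                          # placeholder observation covariances
H = rng.normal(size=(m, n))

B_sqrt = sym_sqrt(B)
B_inv_sqrt = np.linalg.inv(B_sqrt)
R_inv_sqrt = np.linalg.inv(sym_sqrt(R))

# Right-singular vectors of the observability matrix.
_, w, V_R_T = np.linalg.svd(R_inv_sqrt @ H @ B_sqrt, full_matrices=True)
V_R = V_R_T.T

T = V_R.T @ B_inv_sqrt                       # model space -> phase space
T_inv = B_sqrt @ V_R                         # phase space -> model space

# Singular values are sorted, so the signal-related elements come first.
n_signal = int(np.sum(w > 1.0))

dx = rng.normal(size=n)                      # some increment in model space
dx_prime = T @ dx
dx_prime_signal = dx_prime.copy()
dx_prime_signal[n_signal:] = 0.0             # discard noise-related elements
dx_signal = T_inv @ dx_prime_signal          # back to model space

print("signal-related elements:", n_signal, "of", n)
```

Only the first `n_signal` phase-space elements survive; in model space the filtered increment generally has non-zero entries for all aerosol components, because each phase-space element is a linear combination of them.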

Appendix D: Minimization of the cost function with constraints
In the minimization of the cost function all elements of the control vector $\delta x$ are independently adjusted until the minimum of $J$ is found. This may not be a prudent approach if the information contained in the observations is insufficient to constrain all model variables. In such a case one should introduce constraints that reduce the number of independent control variables. However, this needs to be done in a clever way; the goal is to neither underuse the measurements (thus wasting available information), nor to overuse them (thus assimilating noise).
For reasons we will explain later we formulate the constraints as weak conditions. However, for didactic reasons as well as for the sake of completeness, we will also mention how to formulate constraints as strong conditions.

D1 Minimization of the cost function with strong constraints
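Given $k$ constraints in the form $g_i(\delta x) = 0$, $i = 1, \ldots, k$, the most general way of finding the minimum of $J(\delta x)$ under the constraints $g_i$ is the method of Lagrange multipliers. More specifically, one introduces $k$ Lagrange multipliers $\lambda_1, \ldots, \lambda_k$ and defines the function

$$L(\delta x, \lambda_1, \ldots, \lambda_k) = J(\delta x) + \sum_{i=1}^{k} \lambda_i \, g_i(\delta x).$$

The constrained minimum is then obtained by solving $\nabla L = 0_{n+k}$, where $\nabla = \nabla_{\delta x_1, \ldots, \delta x_n, \lambda_1, \ldots, \lambda_k}$ is now a $(n + k)$-dimensional gradient operator, and where $0_{n+k}$ denotes the null vector in an $(n + k)$-dimensional space. Note that in this general formulation of the problem the constraints can even be nonlinear. We are specifically interested in linear constraints, which can be expressed in the form $G \cdot \delta x = 0_k$. Then the constrained minimization problem becomes

$$\nabla_{\delta x} J + G^T \cdot \lambda = 0_n, \quad G \cdot \delta x = 0_k.$$

Compared to the unconstrained minimization problem, the introduction of $k$ constraints has increased the dimension of the problem from $n$ to $n + k$. Naively, one may have expected that the dimension would, on the contrary, be reduced to $n - k$. This is indeed the case if the constraints are linear, and if the function $J$ is quadratic, as is the case in Eqs. (B7)-(B9). To see this, let us first write those equations more concisely in the form

$$J = \frac{1}{2} \delta x^T \cdot Q_1 \cdot \delta x + Q_2^T \cdot \delta x + Q_3,$$

with $Q_1 = B^{-1} + H^T \cdot R^{-1} \cdot H$, $Q_2 = -H^T \cdot R^{-1} \cdot \delta y$, and $Q_3 = \frac{1}{2} \delta y^T \cdot R^{-1} \cdot \delta y$. Note that the covariance matrices and their inverses are symmetric (i.e. $R^T = R$, etc.).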
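The two views of the dimension of the constrained problem can be compared in a short numerical sketch. It assumes a quadratic cost function with gradient $Q_1 \cdot \delta x + Q_2$ and linear constraints $G \cdot \delta x = 0_k$; the matrices are random placeholders, and the null-space basis `Z` is our own illustrative choice for eliminating the constraints. The $(n + k)$-dimensional Lagrange system and the $(n - k)$-dimensional reduced system should give the same increment.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 2                                   # hypothetical sizes

A = rng.normal(size=(n, n))
Q1 = A @ A.T + n * np.eye(n)                  # symmetric positive definite
Q2 = rng.normal(size=n)
G = rng.normal(size=(k, n))                   # k linear constraints G dx = 0

# (n + k)-dimensional formulation with Lagrange multipliers:
#   Q1 dx + Q2 + G^T lam = 0,   G dx = 0.
K = np.block([[Q1, G.T], [G, np.zeros((k, k))]])
rhs = np.concatenate([-Q2, np.zeros(k)])
sol = np.linalg.solve(K, rhs)
dx_lagrange, lam = sol[:n], sol[n:]

# (n - k)-dimensional formulation: write dx = Z z, where the columns of Z
# span the null space of G, so that the constraints hold automatically.
_, _, Vh = np.linalg.svd(G)                   # Vh is (n x n)
Z = Vh[k:].T                                  # n x (n - k) null-space basis
z = np.linalg.solve(Z.T @ Q1 @ Z, -Z.T @ Q2)
dx_reduced = Z @ z

print("max difference:", np.max(np.abs(dx_lagrange - dx_reduced)))
```

The reduced system is smaller, but it requires a basis of the null space of $G$; which formulation is preferable in practice depends on how cheaply such a basis can be obtained.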
The unconstrained minimization problem requires us to solve the equation $\nabla_{\delta x} J = Q_1 \cdot \delta x + Q_2 = 0_n$. Now we want to minimize the cost function subject to the linear constraints $G \cdot \delta x = 0_k$.

… space. Thus, even when using a spectral formulation of the 3DVAR method, one can still compute the constraints in physical space and add their contributions to $J$ and $\nabla J$. The advantage of this is, as explained above, that the SVD of the observability matrix can be computed in the reduced subspace, which substantially reduces the dimension of the numerical SVD problem.

Another aspect concerns the positive square root of the background error covariance matrix, which appears in essential parts of the theory, namely in Eqs. (C6) and (D16). In theoretical developments it is, arguably, didactically expedient to work with the matrix $B^{1/2}$. But in practice there are numerically more efficient formulations. One such approach is discussed in Kahnert (2008) in the context of a spectral formulation of the variational method. The spectral formulation is applied to the full $B$ matrix in order to reduce the dimension of the problem of diagonalizing this matrix. This method is our method of choice in the formulation of the background and observation terms in the cost function given in Eqs. (B8) and (B9), respectively. However, in the formulation of the constraint term given in Eq. (D17) we can substantially reduce the dimension of the matrix $B$ by working in the reduced space in which only the covariances $B^{(\alpha,l)}$ among aerosol components are considered. One could compute the matrix $(B^{(\alpha,l)})^{-1/2}$ in Eq. (D17) by diagonalizing the matrix $B^{(\alpha,l)}$. However, a numerically much more efficient approach is to work not with the positive square root, but with the so-called Cholesky decomposition of the $B$ matrix,

$$B = C_u^T \cdot C_u,$$

where $C_u$ is an upper triangular matrix. Thus the actual algorithm we used for formulating the constrained minimization of the cost function is obtained by replacing in the preceding formulas all instances of the matrix $B^{1/2}$ with the matrix $C_u^T$ (and, similarly, by replacing the inverse matrix $B^{-1/2}$ by the inverse of the transposed Cholesky factor, $C_u^{-T}$).
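The substitution of the Cholesky factor for the positive square root can be illustrated with a small numerical sketch. It is not the authors' code: the covariance matrix `B`, the vector `v`, and the linear operator `H` below are random stand-ins introduced only for illustration. Note that `numpy.linalg.cholesky` returns the lower triangular factor $L$ with $B = L L^T$, so the upper triangular factor used in the text is $C_u = L^T$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)      # stand-in symmetric positive definite covariance

# Positive square root by diagonalisation: B = E diag(w) E^T
w, E = np.linalg.eigh(B)
B_half = E @ np.diag(np.sqrt(w)) @ E.T

# Cholesky factor: NumPy gives lower-triangular L with B = L L^T, so C_u = L^T
L = np.linalg.cholesky(B)
C_u = L.T

# The quadratic form v^T B^{-1} v is the same with either factor.
# (np.linalg.solve is used for brevity; a dedicated triangular solver
# would exploit the structure of C_u.)
v = rng.standard_normal(n)
q_sqrt = np.sum(np.linalg.solve(B_half, v) ** 2)   # |B^{-1/2} v|^2
q_chol = np.sum(np.linalg.solve(C_u.T, v) ** 2)    # |C_u^{-T} v|^2
q_ref = v @ np.linalg.solve(B, v)                  # v^T B^{-1} v

# Singular values of an operator scaled by either factor also agree,
# because C_u^T and B^{1/2} differ only by an orthogonal matrix.
H = rng.standard_normal((3, n))  # hypothetical linear observation operator
s_sqrt = np.linalg.svd(H @ B_half, compute_uv=False)
s_chol = np.linalg.svd(H @ C_u.T, compute_uv=False)
print(np.allclose(q_sqrt, q_chol), np.allclose(s_sqrt, s_chol))
```

In this sketch the two factors yield identical quadratic forms and identical singular values; only the singular vectors on the model-space side are rotated. The Cholesky route avoids the eigendecomposition altogether.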

Figure 3. Observations (black solid line), and observation equivalents of the background estimate (green), and of the unconstrained (blue) and constrained (red) 3DVAR analysis. The optical parameters and wavelengths are indicated above each panel.

Figure 4. Vertical profiles of the transformed model variables $\delta x$, defined in Eq. (9). The figure shows results obtained with the constrained (red) and unconstrained (blue) 3DVAR analysis.

Figure 5. The first five rows (from top to bottom) of the matrix $V_R^T \cdot B^{-1/2}$ at the observation site, and for model layers 2 (left) and 22 (right). The y values are normalized by dividing them by the maximum element. The x axis indicates the aerosol components in model space to which the elements of the row vectors correspond, namely, sea salt (NaCl), EC, OC, and dust, each in four size bins, as well as the four SIA components: sulfates (SO$_x$) other than (NH$_4$)$_2$SO$_4$, ammonium sulfate (AS), ammonium nitrate (AN), and nitrates (NO$_x$) other than NH$_4$NO$_3$.

Table 2. Number of signal degrees of freedom $N_s$ and reduction in entropy $H$ as a function of observation standard deviation, taken from the lowest model layer (closest to the surface). Results are shown for different subsets of $k_1$, $k_2$, $\beta_1$, $\beta_2$, $\beta_3$, where $k_i$ and $\beta_i$ represent the extinction and backscattering coefficients, respectively, at the wavelengths $\lambda_1 = 1064$ nm, $\lambda_2 = 532$ nm, and $\lambda_3 = 355$ nm.

Table 3. Signal degrees of freedom $N_s$ and change in entropy $H$ for the lowest model layer (closest to the surface). Also shown are the singular values $w_i$ and their contributions $N_s^i$ and $H^i$ to $N_s$ and $H$, respectively. The results have been obtained by assuming an observation standard deviation of 50 %.