Fast retrievals of tropospheric carbonyl sulfide with IASI

Iterative retrievals of trace gases, such as carbonyl sulfide (OCS), from satellites can be exceedingly slow. The algorithm may even fail to keep pace with data acquisition such that analysis is limited to local events of special interest and short time spans. With this in mind, a linear retrieval scheme was developed to estimate total column amounts of OCS at a rate roughly 104 times faster than a typical iterative retrieval. This scheme incorporates two concepts not utilized in previously published linear estimates. First, all physical parameters affecting the signal are included in the state vector and accounted for jointly, rather than treated as effective noise. Second, the initialization point is determined from an ensemble of atmospheres based on comparing the model spectra to the observations, thus improving the linearity of the problem. All of the 2014 data from the Infrared Atmospheric Sounding Interferometer (IASI), instruments A and B, were analysed and showed spatial features of OCS total columns, including depletions over tropical rainforests, seasonal enhancements over the oceans, and distinct OCS features over land. Error due to assuming linearity was found to be on the order of 11 % globally for OCS. However, systematic errors from effects such as varying surface emissivity and extinction due to aerosols have yet to be robustly characterized. Comparisons to surface volume mixing ratio in situ samples taken by NOAA show seasonal correlations greater than 0.7 for five out of seven sites across the globe. Furthermore, this linear scheme was applied to OCS, but may also be used as a rapid estimator of any detectable trace gas using IASI or similar nadir-viewing instruments.

also confirmed for tropospheric OCS based on a compilation of zenith-viewing ground observations and balloon campaigns (Krysztofiak et al., 2015).
In retrospective analysis, Glatthor et al. (2015) used the limb-viewing Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) instrument to retrieve OCS concentrations at the lowest-most detectable level, 250 hPa, using standard optimal estimation techniques. The compiled results from 2002 − 2012 in 5 • by 15 • latitude-longitude bins showed clear 5 evidence of elevated ocean sources and tropical rainforest sinks that vary with season. However, limb-viewing instruments are not ideal for tropospheric sounding and the 250 hPa level fails to probe the troposphere at high latitudes as the tropopause decreases in altitude from the equator.
Most recently, Kuai et al. (2014) developed an optimal estimation retrieval scheme to estimate OCS amounts using the Tropospheric Emissions Sounder (TES). TES is a nadir-viewing Fourier transform spectrometer (FTS) instrument aboard 10 NASA's Aura satellite that was launched into polar orbit in 2004. TES is similar in many ways to IASI, but with finer spectral and spatial resolution. However, since TES scans global swathes one half the time (staring operations the other half) and there are currently two orbiting versions of IASI, TES has coarser temporal resolution in comparison.
This retrieval first estimates a vertical profile of OCS on many vertical levels and then averages the levels between 900 and 200 hPa, because the degrees of freedom for the signal (DFS) of the profile is less than one when using a prior constraint of 15 20 % OCS variability. The DFS is qualitatively defined to be the number of independent pieces of information that come from the signal rather than the noise (Rodgers, 2000, ch. 2.4). Therefore, only one bulk level of OCS is ever distinguishable and even then it is a weighted combination of the true estimate and the a priori, which was taken by Kuai et al. to be spatially flat across all locations. The OCS retrieval is carried out after the routine retrieval of temperature, H 2 O, O 3 , CO, CO 2 , CH 4 , surface temperature, emissivity, cloud optical depth, and cloud pressure. Only scenes with a cloud optical depth less than 0.5 20 are considered as cloudy scenes further reduce the OCS information content. The OCS retrieval itself then jointly includes CO 2 , H 2 O, surface temperature, cloud optical depth, and cloud pressure in the state vector and uses the posterior covariances from the preprocessed retrieval as the constraints for these extra parameters in the OCS retrieval.
A monthly mean of TES OCS results from June 2006 was published in Kuai et al. (2015), which further validated the concept that direct ocean emissions of OCS are much greater than previously thought (Berry et al., 2013). The published data 25 included retrievals over ocean between ±40 • , because the DFS rapidly fell to values less than 0.5 outside of this range. This means that the majority of the estimates at higher latitudes were dominated by the flat prior OCS field rather than the true OCS concentrations. To put it another way, the uncertainties from an unconstrained retrieval outside of this latitude range would be greater than the prior constraint of 20 % variability. An alternative approach would be to lessen the prior OCS constraint to extend the detectable latitude range, but at the expense of greater uncertainty in the retrieved values. However, the increase in 30 uncertainty can be mitigated by averaging over a greater number of pixels which reduces uncertainty by the square root of the sample size. On the other hand, if the retrieved estimates are mostly a priori from tight systemic constraints, then no amount of averaging changes this fact. Nonetheless, the TES product created by Kuai et al. is the current leading retrieval scheme of tropospheric OCS.

Method description
This section methodically discusses the mathematical framework, formulation, and parameter validation of the retrieval scheme applied to OCS. Caution is advised to not overly compare the presented method to a standard optimal estimation routine based upon iterating a time consuming forward model. The intent of this method is to rapidly estimate OCS in a single step with minimal dependence upon prior assumptions. Retrieval error due to avoiding the residual non-linearities are statistically 5 quantified for reference.

Linear retrieval framework
A forward model (F ) is a numerical construct that represents the physics of how a given state produces an observable quantity.
In this case, F models how electromagnetic radiation propagates through an atmospheric state (x) to yield the radiance observed by IASI (y) with m number of spectral channels. The work presented in this paper uses the Reference Forward Model 10 (RFM) to simulate such spectral radiances (Dudhia, 2016). When the radiative transfer function is sufficiently linear about a reference state vector (x 0 ) of length n, F can be linearised according to Here is the error in the measured signal relative to the forward model and K ∈ R m×n , referred to as both the "weighting function" and the "Jacobian", is defined to be a matrix of partial derivatives such that K ij = ∂F i (x) /∂x j . 15 Solutions to Eq. (1) can be estimated in the optimal estimation framework by considering a linearisation about an a priori reference state (x a ). Estimates of an atmospheric state (x) are given bŷ where G is referred to as the gain matrix (Rodgers, 2000, ch. 4). The covariance matrix of the stochastic error in the measurements is denoted as S . Since raw spectra from a FTS such as IASI are generally uncorrelated, S has zeroes in the off-diagonal 20 elements while the diagonal elements are the variances of the signal at that spectral position. However, because IASI spectra are apodized on-board the satellite (Amato et al., 1998), off-diagonal spectral correlations are thus introduced into S . The term a priori is meant to include both a mean state, x a , and its covariance S a . Inverting S a in Eq. (2) applies a "soft" constraint upon the solution, penalizing estimates that deviate greatly from the atmosphere provided in the prior estimate.
When the probability density function of the atmospheric state is symmetric about the expected value, the posterior covari-25 ance (i.e., the estimated covariance ofx) is found to bê This is a convenient result, because it means that the uncertainties and correlations between retrieved parameters are generated as a by-product of the retrieval process. Eq. (3) also highlights the fact that ifŜ x = S a , then the retrieval has done nothing to improve upon the a priori and is completely insensitive to the estimated parameters.

30
Further diagnostic information about the retrieval is succinctly contained in a unitless n × n matrix known as the averaging kernel matrix (AKM), defined as Using this relation, Eq.
(2) can be rewritten in the more insightful but less practical form, Repeated analysis of A can be unwieldy when developing a retrieval algorithm. Therefore, a scalar "figure of merit" is often desirable that allows for multiple matrices of A to be compared in a straightforward manner. The DFS, as mentioned in Sect. 2.2, is one such possible metric and is calculated by taking the trace of the averaging kernel matrix, 15 Perfectly conditioned non-trivial inverse problems will have DFS values equal to the number of state parameters, n.
With these relations at hand, the proposed retrieval is a direct application of Eq.
(2), where the RFM is represented by F and used to model IASI radiances and create Jacobian spectra (K). Rather than use a climatological static mean value for x a as the linearisation point, the ensemble of 80 atmospheres selected to parametrise the RTTOV forward model (Matricardi, 2008) was used to create a subsequent ensemble of initial states (x a ), model spectra (F (x a )), and gain matrices (G). The model spectrum 20 that most closely matches the observed IASI spectrum is used to select the initialisation point from the ensemble, which will be discussed further in Sect. 3.6. Once again, the point of this process of selecting a model atmosphere from an ensemble is to make the retrieval as linear as possible without iterating the forward model.
Brightness temperature spectra were intentionally used instead of radiance spectra because removing curvature from the Planck function improves the linearity of the retrieval (Rodgers, 2000, Ch. 5.1). The downside to this is that the measurement 25 noise in brightness temperature space (NE∆T) becomes a function of the observation (see the red lines in Fig. 5 for example).
Therefore, the measurement covariances (S ) were adjusted specifically for each atmosphere based on the model spectra when computing the gain matrices. Apodization was modelled in the off-diagonal elements of S according to the discussion in Amato et al. (1998).

30
Identifying OCS spectral features is a straightforward process. Figure 1 shows the spectral range targeted in this study (2000 − 2300 cm −1 ), including the dominant ν 3 rotational-vibrational band of OCS in the thermal infrared centred at 2060 cm −1 . Notice Figure 1. Top: A simulated IASI BBT spectrum from a desert (i.e., low humidity) atmosphere covering the spectral range used in the linear retrieval. Middle and Bottom: Jacobian spectra showing the change in BBT for a 1% increase in volume mixing ratio (VMR) for the gases listed. The CO2 and N2O spectra represent tropospheric perturbations while the remaining four are total column perturbations. The red bar denotes the area between two H2O lines where a large portion of OCS information comes from.
that H 2 O and CO 2 are the primary contaminants here with additional contributions from CO and O 3 . This also shows there are no isolated OCS spectral lines and that the other detectable species must be accounted for explicitly during the retrieval.
The spectral range included in this retrieval is much larger than the OCS spectral band, which runs from 2040 − 2080 cm −1 . This is to provide temperature and contaminating gas information from the spectrum as location specific a priori are not used.
In particular, the CO 2 and N 2 O spectral features are of various line strengths which saturate at different effective altitudes 5 throughout the vertical profile. Since these two gases are well mixed with low natural variability, they provide robust information on atmospheric temperature. In an iterative retrieval, a much narrower spectral region would be used and the additional information would be supplanted by weather specific a priori to save time computing the forward model. Since the forward model is pre-calculated in this method, the added spectral range only increases the number of linear algebra operations.
The spectral characteristics of the observation and the applied constraints determine the vertical sensitivity of the retrieval.   Kuai et al. (2014). However, when the surface ground temperature is significantly warmer than the surface air temperature (positive thermal contrast), then lower-most tropospheric OCS becomes up to three or four times more detectable. This is because thermal contrast between the surface and the atmospheric temperature accentuates spectral absorption or emission features, which makes them easier to distinguish.
3.3 Defining the state vector and prior covariance 5 Even though OCS is the desired target, the intent of the joint retrieval is to simultaneously account for all physical parameters that affect the observed spectrum above the noise level. Mathematically this is handled via the extra rows in the gain matrix (G) and the resulting cross-terms when multiplied by y. If they are not accounted for, then the other physical parameters become biased into the target estimate. If this were to happen then one could not say with confidence whether an OCS enhancement or depletion was actually due to OCS or something else, such as water vapour or surface temperature.

10
With this in mind, the state vector is chosen to be where the superscript indexes the vertical location of the retrieved atmospheric layer, as visualised in Fig. 3, and the absence of a superscript for a gas implies a total column amount. Specifically, the natural logarithm of the volume mixing ratios (VMRs) is retrieved to enforce positivity in all of the gases and dampen the effect water vapour variability may have upon the results. 15 The term T s represents ground surface temperature. Emissivity is not included in the state vector because the emissivity  Jacobian is highly spectrally correlated (> 0.9) with the surface temperature Jacobian and indistinguishable from the other without strict use of a priori. Therefore, considering the surface emission term in the equation of radiative transfer, it is clear that the retrieved quantity is effectively s T s for spectrally grey emissivity. It is important to note that spectrally changing surface emissivity across the range 2040 − 2080 cm −1 is currently not accounted for and may influence the OCS results over land. However, spectral features of solids and liquids tend to be much broader than gases such that a grey approximation may 5 be valid. Another source of error that may be more important than non-grey emissivity is the fact that all atmospheres in the ensemble were modelled with a surface emissivity of 0.99, which neglects downwelling radiation reflected back into the optical path. In both cases, observations over desert will be affected the most with minimal emissivity impact over water and dense vegetation.
The ratio of CO 2 to N 2 O is included instead of the two separately to improve the conditioning of the inverse problem; which 10 means that there is not enough independent information in the measurement to estimate both gases and atmospheric temperature without added constraints. Whilst N 2 O is a low variability gas that does not overlap with the OCS spectral features, the point of including it in the ratio is to account for variations in CO 2 that may affect the OCS estimate. The downside to retrieving a ratio of two gases is that the knowledge of whether the numerator is enhanced versus a depletion of the denominator, and vice versa, is sacrificed for the improved independence of elements in the state vector. As shown in Fig. 3, four bulk layers of atmospheric temperature are retrieved ranging from the lower troposphere through the stratosphere. Additionally, two layers of water vapour are retrieved. The first layer is the lower-most troposphere that primarily accounts for water vapour continuum effects between absorption lines in the Jacobian as a result of self-broadening from The perturbation of the second layer peaks at 600 hPa, but includes contributions from the remaining upward levels of the atmosphere and accounts more for the absorption feature centres.

5
Since the statistical distributions of temperature and water vapour vertical profiles are well known, the resulting estimates can be constrained to scenarios found on Earth where clearly unphysical profiles are excluded with a negligible loss of sensitivity.
Furthermore, since atmospheric temperature and water vapour are physical correlated, it is possible to represent this effect in the prior covariance. Thus, the 80 atmosphere ensemble was vertically binned down to the bulk layers of the retrieval and used to calculate the sample covariance matrix, which includes the cross-state physical correlation terms. The subsequent correlation 10 matrix is shown in Fig. 4 with the standard deviations annotated along the diagonal elements. This sample covariance is then used as a sub-matrix (6 × 6) within the prior covariance (11 × 11) to constrain the water vapour and temperature portion of the retrieval to physical values within the global range. Further correlation terms between the remainder of the state elements are assigned to values of zero. As a caveat, all elements of the state vector, including OCS, are technically constrained with finite values in the diagonal of the prior covariance. This is primarily for the purposes of developing a test-bed iterative retrieval that utilizes the Levenberg-Marquardt method, which will be discussed next. OCS variability in the prior covariance is assigned to be 200%. CO and O 3 variability is assigned 100%, the CO 2 /N 2 O ratio is set to 10%, and surface temperature 20 K. However, this is such a weak constraint that the DFS for the OCS total column is close to one for all atmospheres and, therefore, effectively unconstrained.

Parameter validation using an iterative retrieval
Validation of the retrieval framework, as previously defined, is crucial towards developing confidence in the resulting estimates.
Without analysing external data, one can show using an iterative retrieval that: 1. The estimates converge during iteration.
2. The OCS spectral signature is noticeable in the residual spectrum of the converged result when excluded from the state 10 vector and all other parameters are retrieved.
3. The variability of the converged residual spectrum over many pixels is similar to the expected instrument noise.
Each point is discussed in turn.
The iterative retrieval was written as a test-bed for the faster linear scheme; so the spectral range, state vector, and prior covariance are the same as previously defined. This nonlinear approach is based on the Levenberg-Marquardt method as 15 discussed in (Rodgers, 2000, ch. 5.7). The prior state is taken to be equal to the initial state, which is selected on a pixel by pixel basis from the ensemble of atmospheres. Each model atmospheric spectrum is compared against the measured spectrum and the j th atmosphere that minimizes the spectral cost, i.e., is chosen as the starting point. For atmosphere selection, only the diagonal of S is used to save computation time when rastering 20 through the 80 atmospheres. Scenes with calculated cloud fractions from the Advanced Very High Resolution Radiometer (AVHRR) embedded data (Saunders, 1986) greater than 20% are not included. Based on this methodology it was found that the majority of IASI pixels converged on a result that reduced the χ 2 cost function. In reality, this test for convergence was repeated numerous times as the state vector and prior covariance were modified until settling on the parameters defined in the previous section.

25
OCS signatures can be shown in the converged residual spectrum (IASI minus RFM) if all other contributing parameters are retrieved. This is done by removing OCS from the state vector while retrieving the other 10 in its absence. Figure 5 shows an example of this for an IASI pixel in the North Atlantic off the coast of Iceland where the retrieved surface temperature is 281 K. Notice that the OCS spectral signature is clearly above the IASI noise level for a particularly low surface VMR estimate of 404 ppt and matches well to the predicted OCS residual of the same VMR. It is important to keep in mind that Fig. 5 is for 30 a single pixel without any spectral averaging to reduce instrument noise. Also apparent, is a substantial spike in the residual centred at 2077 cm −1 . This feature is presumably due to line mixing errors within the RFM for the CO 2 Q-branch located at this position. Therefore, these particular channels should be avoided as they are poorly modelled.
Once all physical parameters that contribute to the signal above the noise level are accounted for through the joint retrieval, then the standard deviation of the spectral difference between the observation and the model, i.e., the residual, should be equal to the instrument noise. If this is not the case, then any parameters that are not completely accounted for will show an associated 5 spectral feature in the standard deviation of the residual spectra. To test this posit, the iterative retrieval was run over 600 pixels in a 10 • × 10 • latitude and longitude box in the Equatorial Pacific Ocean. The variability of the sample residuals is shown in Fig. 6 along with the average instrument noise profile in units of brightness temperature. Observe that the variability of the residuals matches closely to the average instrument noise with the exception of a few spectral features due to water vapour ( Fig. 1). Therefore, the retrieval and associated state vector sufficiently account for the noticeable physical parameters aside 10 from water vapour, which could be further resolved with more vertical levels along the profile.
There are three options to pursue with regards to unresolved, but influential, H 2 O levels in the retrieval. Firstly, this effect could be tolerated as an unaccounted source of error in the OCS estimates. Secondly, additional layers of H 2 O could be included in the state vector and jointly retrieved with an updated prior constraint. Thirdly, these associated H 2 O spectral features could be treated as effective noise within the measurement covariance, thus decreasing the sensitivity of the retrieval to variations in the water vapour vertical profile.
The first option is undesirable because there is clearly evidence supporting further treatment of H 2 O. At first attempt, three layers of H 2 O were included in the state vector with a new prior covariance derived from the 80 atmosphere ensemble. However, it was found that this formulation did not converge unless a much stronger prior constraint was constructed. Therefore, these 5 spectral variations for H 2 O were instead treated as noise by creating a vector of scaling factors that increased the variances of the measurement covariance accordingly. This was accomplished by taking the ratio of the variance of the residual spectra over the square of the IASI instrument noise and setting any values less than one to unity. Thus, making the retrieval less sensitive to unretrieved layers of water vapour. All estimates of OCS from this point further include scaled variances in the measurement covariances due to water vapour variability.

Channel selection
Spectral channels in remote sensing tend to be highly correlated, not only by the gas specific rotational-vibrational energy transitions, but through other physical effects such as temperature and pressure. In other words, each channel does not normally add independent information and contains a certain amount of redundancy. In theory, adding more channels to the estimate always increases the total information content to varying degrees. In practice, there are spectral channels that contain more 15 information than others such that adding channels of negligible importance does little to improve figures of merit (like DFS and posterior uncertainty), but increases sensitivity to unaccounted physical parameter errors. One method to improve the robustness of a retrieval by reducing sensitivity to unaccounted parameters is to select a subset of spectral channels that contains the majority of information while excluding the remaining channels that negligibly contribute.
Channel selection was performed over the 2000 − 2300 cm −1 range in order to remove these spectral channels of little importance. One option is to remove channels while maximizing a figure of merit for the joint retrieval as a whole. Another is to maximize just the OCS portion of the retrieval at the expense of the other retrieved parameters. Since the other states are 5 included just to improve the OCS estimates, the latter option is chosen here.
OCS is so weakly constrained that attempting to maximise the DFS is not appropriate in this instance. In the unconstrained case, the DFS is not defined for maximum likelihood estimates. However, it is always desirable to minimize the posterior uncertainty, whether constrained or not. In this case, just the uncertainty component of OCS is considered: 10 whereŜ x is defined in Eq.
(3) and the subscript index denotes the first diagonal element corresponding to OCS.
The selection begins by first finding the best two spectral channels that minimiseσ 2 OCS after calculating all possible two channel combinations. Then a third channel is selected by adding all remaining channels individually and choosing the one which reducedσ 2 OCS the most. This process is repeated until all spectral channels have been ranked according to their contribution towards minimizing the posterior uncertainty of OCS. The resulting channel ranking for a mid-latitude atmosphere is 15 visualised in Fig. 7 where the best two channels estimate OCS uncertainty to be nearly 50% while including all 1201 channels reduce the uncertainty to just over 10%. Notice that the first 20 channels reduce uncertainty by a factor of two from the initial pair, but it takes the remaining channels to gain another factor of two reduction. For this retrieval the top 100 channels were retained, which yield an uncertainty of just 12% (versus 10%) for this particular atmosphere with 12 times fewer channels.
The resulting selected channels are shown in Fig. 8 for reference. Channels are selected from this method covering the 20 entire spectral range, rather than just the 40 cm −1 OCS interval, because these outside channels contribute to the other 10 parameters jointly estimated that help improve the OCS retrieval. Channels are only selected in so far as they contribute to better OCS estimates. The CO 2 Q-branch at 2077 cm −1 was avoided by heavily penalizing these channels within the measurement covariance prior to running the selection. Notice that the selected channels largely avoid the majority of H 2 O absorption features and frequently select the between band channels associated with water vapour continuum.

Selecting the initial atmosphere
The validity of the linear retrieval is contingent upon the choice of initial atmosphere. The initialisation point should be sufficiently close enough to the observed atmosphere that a single step places the estimate within the uncertainty level of the true state being observed. Failure to do so results in retrieval error due to the nonlinearity of the formulated problem. So how should an initial atmosphere be selected in order to minimize the nonlinearity error? Three possible techniques are analysed 30 for determining the initial atmosphere that do not require rerunning the forward model.
1. Select the initial atmosphere whose model spectrum minimizes the spectral cost function in Eq. (8), as previously discussed. This method essentially picks the atmosphere whose mean spectrum is closest to the IASI observation for the selected spectral channels. The diagonal may be used to approximate S to speed up the process of running through the 80 atmospheres for each pixel since the selected channels contain few adjacent pairs.
2. Another method is to estimate what the model spectrum would be after the retrieval, within the linear framework of the problem, and then select the atmosphere which minimizes the projected spectral cost. The retrieved state can be linearly projected back into spectral space to estimate the posterior spectrum, Ifŷ is substituted for F (x) in Eq. (8) andx is expanded using Eq.
(2), then the resulting projected cost is given by In some texts KG is referred to as the Data Resolution Matrix (DRM) and, unlike GK, is generally not equal to the identity matrix in the unconstrained least-squares retrieval.  ensemble are calculated to yield two matrices; an array of initial spectral differences (∆BBT of size 6320 × m) and a vector of corresponding linearly retrieved OCS errors (δOCS of size 6320 × 1). The goal is to determine a prediction vector (a of size m × 1) that approximates the following equation: However, since there are only 80 independent atmospheres considered, the dimensionality of the problem must be re-5 duced if Eq. (12) is to be successfully inverted to find a. Therefore, ∆BBT is decomposed into singular vector components, ∆BBT = UΛV T , where U and V are the left and right singular vectors, respectively, and Λ is a diagonal matrix of its singular values. The inner dimensions of U and V T are then ranked in order of decreasing singular values and truncated at 79. Equation (12) is then recast as 10 where the truncated least-squares solution to a is calculated. Finally, the prediction vector is found to be a = V a .
These three initial atmosphere selection methods are compared using the RTTOV ensemble in the absence of instrument noise and contaminating parameters so that the error due solely to non-linearity is assessed. Each atmosphere of the 80 is used as a test case where the objective is to select an initial atmosphere from the remaining 79 which minimizes the error in the estimate while knowing the true OCS model value. In the ensemble all atmospheres contain the same OCS profile, because of 15 Figure 9. Histograms of the OCS retrieval error due to non-linearities are shown for four methods of selecting the initial atmosphere, x0.
The first enumerated method in this section is labeled 'mean spectrum', the second 'projected cost', and the third 'trained error prediction'.
the lack of information about its distribution and variability. The model OCS profile is 590 ppt at the surface, constant up to the tropopause, and steadily decreasing with altitude through the stratosphere. Therefore, the best initial atmosphere yields a retrieval step closest to zero, because OCS is a flat field throughout the model atmospheres. Figure 9 shows histograms of the linear assumption error for the three selection methods discussed. Furthermore, a method of randomly selecting the initial atmosphere was also analysed to provide a baseline for comparison. Evidently, the method 5 that most frequently yielded retrieval errors near zero was from selecting the atmosphere that minimized the projected cost.
This was followed by matching the mean spectrum (minimizing the initial spectral cost). Predicting the retrieval error worked in the sense that it outperformed the baseline of random selection, but provides larger error than the other two methods.
One may conclude that the initial atmosphere should be selected based upon minimizing the linearly projected cost. However, this was attempted with real data and in practice it became clear that a grossly non-linear starting point, such as using an initial 10 polar atmosphere for an observation in the tropics, may occasionally be projected to outperform all other atmospheres. This is because linear analysis is only valid in the nearly linear to moderately linear regimes. Therefore, the method of selecting an initial atmosphere by minimizing the difference in the mean spectra, Eq. (8), was used in the work presented here because it avoids this particular problem.
The expected non-linearity error is given by the width of the distributions in Fig. 9 and the mean spectrum method was 15 found to have an error of 11% on average. This analysis was also performed using all spectral channels, i.e., without channel selection, and found to yield an average error of 19% for this method. Thus, channel selection is crucial towards improving the OCS retrieval, because it makes the problem almost twice as linear.

Geographical considerations
Lower most tropospheric pressure is influential in the OCS retrieval not just through the direct effect of pressure broadening the spectral features near the surface, but also because of pressure dependent water vapour continuum effects in the lower 5 troposphere that overlap with all OCS spectral lines. Surface pressure variations due to geographical altitude must, therefore, be accounted for in some way. If not, then the Jacobians and initial column amounts will misrepresent the observation, especially over mountain ranges and high plateaus.
To do this, separate atmospheric ensembles of model spectra, gain matrices, and initial values were created for surface pressure scenarios of 1030, 900, 800, and 700 hPa. Average surface pressure was tabulated from ECMWF reanalysis data and 10 stored as a reference field. Prior to computing the linear estimates of OCS, the surface pressure for each IASI pixel based on its latitude and longitude is interpolated from the saved map. Then the appropriate ensemble is selected based upon whether the interpolated surface pressure falls outside or within the bounds of 950, 850, or 750 hPa. Applying this method noticeably removed any high terrain artefacts that systematically appeared in the OCS estimates.
Additionally, temperature contrast between the ground surface and lowest atmospheric layer affects the sensitivity of the OCS 15 estimates, as shown in Fig. 2. Thermal contrast is a particular problem over land, and especially deserts in the summer, where the surface is heated by solar absorption to values occasionally greater than 15 K above the atmospheric surface temperature.
In the deep Antarctic, there can be a negative thermal contrast where the surface is actually colder than the atmosphere and absorption lines switch to emission features. This effect is far less important over the oceans, because the heat capacity of water is so great that thermal contrast tends to be slightly positive with less variability.

20
Therefore, the method employed in this work is to treat IASI observations over ocean as having a routine thermal contrast of +3 K, while allowing for greater variation over land. Instead of selecting from 80 atmospheres over land with one thermal contrast option, the ensemble is grown to include scenarios of -5, 3, 10, and 15 K of thermal contrast. So an observation over land has 320 possible atmospheric initialization points to select from. As previously mentioned, the model atmospheric spectrum that most closely matches the observed spectrum determines which atmosphere is selected as the initial point.

Quality filtering
In an iterative retrieval, high confidence in the estimate is obtained by verifying that the retrieval converged on a minimum χ 2 value. This may not be the correct minimum, but the fact that a minimum was found suggests that the framework of the problem is behaving in a consistent way. In a one-step linear retrieval the forward model is not recalculated for each individual pixel in order to save computation time. Incidentally, other metrics of quality must be evaluated in order to identify and exclude 30 retrievals that have likely gone awry. The steps to filter the OCS estimates for quality are described in detail.
First, any IASI pixels with an AVHRR cloud fraction of 20% or greater are excluded from consideration prior to computing the retrieval. The presence of cloud introduces highly non-linear behaviour that must be modelled properly if the OCS estimates are to be trusted. This AVHRR cloud fraction product is not perfect and routinely flags sea ice as cloud. However, the vast majority of the time it provides a robust and accurate estimate of the amount of cloud filling the IASI pixel. Therefore, cloudy scenes are simply avoided in favour of clear sky observations.
Next, viewing angles noticeably affected by sun glint are excluded from the retrieval by calculating the specular solar reflection angle based upon the solar and satellite zenith and azimuth angles and removing pixels where this angle is less 5 than 18 • . Additionally, there is a slight overestimation of OCS when observing towards the limb. Rather than attempting to parametrise or mitigate this effect, observations with an air mass factor relative to nadir greater than 1.47 are avoided. This removes the very far edges of the IASI scan where the surface zenith angle is greater than 47 • . For surface zenith angles less than this value, limb effects were not noticeable. Fortunately, the overlap of IASI-A with IASI-B is greater than this angular width, so no spatial gaps in coverage are introduced as a result.
Since the retrieval jointly estimates other physical parameters in conjunction with OCS, there is further opportunity for common sense filtering for quality. For example, if the retrieved surface temperature falls outside of the range between 230 − 340 K, then that pixel is removed from consideration. Furthermore, if the lowest level of retrieved water vapour has a VMR greater than 4%, then the observation is clearly not represented properly and those OCS estimates are excluded.
Finally, the projected spectral cost from Eq. (11) can be used as a retrieval diagnostic given the fact that the atmosphere 15 with the smallest initial spectral cost was selected as the initialization point. The expectation value of the projected cost should be approximately equal to the number of spectral channels (m = 100) if the retrieval were ideally linear. Since the problem is not linear, the average projected cost will certainly be greater than m. However, the magnitude of the projected cost provides a useful prediction as to how well the retrieval may perform. Thus the normalized criteria for accepting a retrieved pixel is given by Aside from filtering against cloudy scenes, this provides the strictest quality test of those mentioned and highlights geographical areas that are poorly represented by the modelled atmospheric ensemble.

OCS results from 2014
The entirety of IASI-A and B data from 2014 (19.4 Tbytes) was downloaded and processed in this study using the previously taking the median rather than the mean. The median also dampens the effect anomalous cases have upon the statistics of the distribution, whereas one bad pixel resulting in a wildly high or low total column amount could artificially dominate the mean.
The number of pixels per bin passing the quality and cloud free criteria is also shown for reference. Only bins containing three or more observations are shown and any areas with two or less observations are considered missing and coloured grey.
Areas that are systematically low in number of observations are either routinely flagged as cloudy or routinely predicted via the projected cost to poorly model the observation. Notice that areas of sea ice towards the poles are consistently absent, likely due to AVHRR cloud flagging. However, persistent glaciers over land contain many more observations and do not experience this

10
This gives an estimate of the width of the OCS distribution based upon the sampling of retrieved values. In an iterative retrieval, the posterior uncertainty fromŜ x is normally used to represent the error of the retrieval. However, within the linear framework of this method the posterior uncertainty derived from the initial guess will systematically underestimate the true error of the OCS retrieval. Thus, the sample standard deviation provides a metric that is a combination in quadrature of retrieval noise, natural OCS variability, and errors due to unaccounted parameters. At a minimum, the sample standard deviation of OCS 15 will be no less than the retrieval noise, assuming there is a sufficient number of samples. Areas that are clearly dominated by retrieval noise are the Antarctic plateaus, Greenland, and high latitude land in the Northern Hemisphere during winter.

Estimates over ocean
Beginning with the oceans, there is a clear correspondence of OCS estimates observed between day and night. Prior to filtering based on the solar reflection angle, it was apparent that sun glint was an issue for estimates over water, especially near the 20 equator. However, by excluding observations along the specular path this issue was mitigated such that the day and night estimates resemble each other. This is the expected result because variations in thermal contrast from the day to night over water should be fairly small. Therefore, OCS should be equally detectable over water regardless of the time of day.
OCS estimates throughout the year show that there is a consistent feature of elevated OCS in the South Pacific off the coast of South America between 0 and −30 • latitude (Fig. 16, point 1) that matches well to the direct OCS emissions modelled in 25 Launois et al. (2015b). While there is some variation throughout the year, this particular feature remains relatively constant regardless of season. In contrast, further South there is a large OCS signal that appears to align with the Antarctic Circumpolar Current (ACC) in both day and night observations (Fig. 16, point 2). This particular feature shows a large seasonal variation with maxima occurring during southern hemisphere summer and minima during winter when incident solar radiance is low.
This is the seasonal cycle one would predict if the primary source of OCS were photochemically reduced CDOM. Before season for photochemical production. OCS features that particularly stand out in these areas are the tropical enhancement during May to June coming off the coast of Baja California (Fig. 12, point 3) and the high latitude structures south of Greenland and north-east of Iceland (Fig. 12, point 4) in this same time period. Additionally, there appears to be a consistent enhancement of OCS in the northern Indian Ocean by the Saudi Arabian Peninsula, which also resembles the model in Launois et al. (2015b).
Interestingly, there is an OCS feature over the Pacific Ocean between Japan and Alaska (Fig. 11, point 5) that is in phase, 5 but one month delayed, with the high OCS signal over the east of China and the Tibetan Plateau. This ocean feature begins in January and February, reaches maximum in March and April, and then dissipates by August. Whereas the OCS land signal over China grows substantially in November and December and then is closer to background levels in May and June. One possibility is that the enhancement over the ocean between Japan and Alaska is an OCS plume originating from China transported by the easterly zonal winds that dissipates when OH concentrations increase during spring and summer. On the other hand, the two 10 signals may be purely coincidental and indicative of two unrelated sources of OCS or other atmospheric characteristics that produce artificially high estimates over these regions.

Estimates over land
Satellite retrievals over land are subject to a greater number of surface type variations than over ocean. As a result, there are more variants contributing to the signal that may require a modelled response; such as emissivity, altitude, surface facets, 15 reflectance distribution functions, and snow cover. Therefore, one must analyse spatially sharp OCS gradients over land coinciding with geographical features and overly distinct land-sea boundaries with a certain amount of scepticism.
Recent work by Glatthor et al. (2015) and Berry et al. (2013) have shown that there should be a noticeable depletion of OCS over the Amazon and Congo rainforest areas due to strong vegetative uptake. This is indeed what is observed in these data, especially for the observations made during the day (Fig. 16, point 6). The Amazon and Congo areas show OCS total 20 columns approximately 10-20% less than what is estimated over nearby oceans at the same latitude. Therefore, these results are consistent with the idea that vegetative uptake is a significant sink of OCS. However, notice that the night-time estimates (Fig. 16,point 7) tend to be slightly greater than the day-time, which may be indicative of a physical OCS process with a diurnal signal. It is also possible that this effect is an artefact of the retrieval. One may quickly blame thermal contrast between day and night observations; except that it is the wrong way around from what is expected. For example, areas over desert like the 25 Sahara and much of Australia show low OCS at night and higher OCS during this day. This is because solar heating increases thermal contrast, which makes trace gases more detectable. During the night, these low humidity areas quickly radiate away their heat and come closer to thermal equilibrium between the surface and the lower atmosphere, thus decreasing sensitivity to OCS. Therefore, if the higher night-time OCS signal over the rainforests is not physical, then it is unlikely to be solely due to thermal contrast.

30
Along this same vein, notice that the high latitude areas over land near the Arctic (Fig. 10 and removing pixels where the model initial atmospheres are predicted to poorly represent the scene. Therefore, it may also be possible that these signals are real and there are large sources of OCS creating local enhancements. If this signal represents physical OCS amounts, then the source is more likely to be anthropogenic in nature given that the detail closely follows geographical boundaries of human population. Finally, the areas of high OCS signal over China and the former Soviet republics east of the Caspian Sea especially stand 15 out in displayed estimates. These are also areas of known SO 2 emissions due to industrial processes and energy production that are routinely modelled in chemical transport models, such as TOMCAT (Spracklen et al., 2005). While it is energetically unfavourable for SO 2 to convert to OCS, the two may be positively correlated in many physical situations, especially in anthropogenic processes that do not have strict methods in place to reduce SO 2 emissions.

Comparisons to NOAA flask samples 20
Total column estimates of OCS from the linear retrieval were also compared to VMR flask measurements of OCS collected by NOAA (Montzka et al., 2007). while higher values fail to reject the null hypothesis. As a rule of thumb, p-values less than 0.05 are generally regarded as statistically significant (Hung et al., 1997).
Comparing pressure specific VMR to total column amount can be tenuous if the true shape of the vertical profile differs greatly from the referenced profile. Furthermore, the flask samples are not exactly coincident with the IASI observations in space and time, so this combines to introduce a certain level of natural error that is difficult to isolate and quantify. However, by 5 analysing on a monthly basis, these effects may be mitigated where the desired outcome is to show correlation and consistency between the seasonal signals of the two. Finally, the NOAA site located in American Samoa (SMO) actually shows a negative correlation between flask samples and 25 IASI estimates. This is entirely due to the first two months of the year, January and February, while the remainder of the year shows a positive correlation. This early year depletion in the total column estimates can be visualized in Fig. 10. Notice that there is a spatial low in OCS total column that extends from the Indonesian islands well into the middle-south Pacific during this time of year. Since this is peak season for photosynthesis in the Southern Hemisphere, it is possible that American Samoa is downwind of Indonesia and northern Australia, strong OCS sinks for January and February, while the ocean surface near 30 American Samoa is emitting OCS or its precursor gases. On the other hand, it is possible this feature is an artefact of some unsensed physical parameter or a weather effect yielding a nonlinear error biased consistently low.
A novel linear retrieval method was developed and applied towards making timely estimates of OCS total columns for the entirety of IASI observations from 2014. There are two components that make this retrieval scheme unique in comparison to current linear methods. First, physical parameters that influence the spectral observations over the wavenumber range used for OCS are directly accounted for by jointly retrieving them along with OCS. This differs from previous methods in that they tend 5 to use an effective measurement covariance that treats the physical parameters not directly retrieved as noise. Second, an initial linearisation point is selected from a global ensemble of atmospheres based on minimizing the spectral difference between the IASI and the modelled spectral radiances. This step is intended to make the retrieval more linear, thus reducing the need for iterative steps that rerun the forward model several times per pixel.
Additionally, an iterative retrieval for OCS was used as a test-bed to develop and validate the framework of the retrieval; 10 i.e., the state vector, prior constraints, and initial atmosphere selection. Once this was accomplished, an ensemble of IASI observations over the Pacific Ocean was used to quantify the mean spectral residual for the converged estimates and showed that the majority of spectral channels match to within instrument noise, except the stronger water absorption features. Water vapour channels were then treated as noise by modifying the measurement covariance diagonals accordingly based on the mean spectral residual. Finally, channel selection was performed based on the OCS posterior uncertainty, reducing the number of 15 channels from 1201 to 100, which ultimately made the OCS retrieval almost twice as linear.
The OCS estimates visualized in two month increments display many interesting features consistent with prior knowledge of its sources and sinks. For example, the day-time total columns show depletions in the OCS signal over tropical rainforests, which is consistent with the idea that vegetation is the strongest sink of OCS. The Pacific Ocean displays spatial features of elevated OCS that vary seasonally and appear to match the prediction made by Berry et al. (2013) that there is a large 20 source in the Pacific Ocean, especially in the southern hemisphere. Interestingly, there is a clear band of high OCS estimates following the circumpolar current north of Antarctica, which is well known for consistent upwelling sustained by turbulent gyres. Additionally, regions of land showing high OCS estimates were found over China, the area east of the Caspian Sea, and northern coastal Africa leading to the Middle East. It is possible these land regions are emitting anthropogenic OCS or that there is some surface property unaccounted for that consistently leads to elevated estimates.

25
To validate the linear retrieval on a monthly basis, these OCS results were compared to surface VMR samples collected via flask by NOAA stations across the globe. It was found that five (three northern and two southern hemisphere) NOAA sites out of seven had seasonal cycle correlation coefficients greater than 0.7. Further comparisons to aircraft campaigns and zenith-viewing surface estimates of OCS may be attempted in the future.
In the absence of a large computational cluster, iteratively analysing forward models of radiative transfer may still be too 30 time consuming to evaluate IASI data beyond individual and area specific events. In this case, one can reduce the accuracy of the retrieval by treating the problem within the linear framework presented in this paper while speeding up the computational process by a factor of roughly 10 4 (depending upon the specific retrieval). Analysis of model scenarios suggests that the error due to ignoring non-linearities is about 11% globally for OCS. Since these linear estimates can be generated so rapidly, it is 8 Figure  The shaded red area represents the sample standard deviation of the total column estimates and the black error bars are the standard deviation of the flask samples within that month (not divided by the square root of the sample size).