Iterative retrievals of trace gases, such as carbonyl sulfide (OCS), from
satellites can be exceedingly slow. The algorithm may even fail to keep pace
with data acquisition such that analysis is limited to local events of
special interest and short time spans. With this in mind, a linear retrieval
scheme was developed to estimate total column amounts of OCS at a rate
roughly

Retrieving atmospheric trace gas concentrations from infrared satellite
observations can be an expensive process, especially when implementing an
inverse method such as optimal estimation

This paper presents a new method for rapidly retrieving trace gas abundances
as applied towards estimating total vertical column amounts of carbonyl
sulfide (OCS). The proposed method is linear in the sense that an estimate
for each pixel is made only once, thus bypassing the iterative steps. By
precalculating the RTM, the retrieval operates roughly

The approach presented here differs from previous work on fast linear
retrievals in two ways: first, an initialization point is selected from an
ensemble of atmospheres based upon how closely the corresponding model
spectrum matches the observed spectrum. Previous work generally uses a global
or wide-region mean atmosphere as the initial guess. By selecting from an
atmospheric ensemble, the problem becomes more linear, which reduces the
non-linear error introduced by failing to iterate towards a converged
solution. Second, all physical parameters affecting the spectral signal above
instrument noise are jointly estimated to account for their influence upon
the desired quantity (OCS total columns in this case). One popular
alternative, as first described by

Atmospheric OCS estimates from IASI observations throughout 2014 are used as a case study for this new rapid retrieval method because OCS is an important trace gas for understanding the global sulfur cycle, is currently poorly modelled, and is at the edge of detectability with nadir-viewing instruments like IASI. While OCS is studied here, the proposed retrieval method can potentially be used for any detectable trace gas. Aside from introducing a novel retrieval method, this paper also shows unprecedented seasonal OCS results from a nadir-viewing hyperspectral instrument.

Carbonyl sulfide is a molecular reservoir species for atmospheric sulfur.
OCS is the longest lived and most abundant sulfur-containing gas in the
unpolluted atmosphere

Yearly OCS trends are approximately constant according to numerous NOAA
sample stations across the globe

The majority of OCS originates from ocean sources, either by direct emission
or secondary production from short-lived oceanic

The remaining sources of OCS are largely anthropogenic with a small
contribution from anoxic soils, such as marshes and wetlands. Industrial
production of rayon and cellophane is known to emit

The vast majority (over 80 %) of OCS is removed from the atmosphere in
conjunction with photosynthesis, either from vegetative canopy or microscopic
organisms in oxic soils, e.g.

Aside from the Atmospheric Trace Molecule Spectroscopy (ATMOS) experiment
using manned space flight, OCS was first observed from satellite by the
Interferometric Monitor for Greenhouse Gases (IMG)

In retrospective analysis,

Most recently,

This retrieval first estimates a vertical profile of OCS on many vertical
levels and then averages the levels between

A monthly mean of TES OCS results from June 2006 was published in

This section discusses the mathematical framework, formulation, and parameter validation of the retrieval scheme applied to OCS. The presented method should not be compared too closely with a standard optimal estimation routine that iterates a time-consuming forward model. The intent of this method is to rapidly estimate OCS in a single step with minimal dependence upon prior assumptions. The retrieval error that results from neglecting residual non-linearities is statistically quantified for reference.

A forward model (

Solutions to Eq. (

When the probability density function of the atmospheric state is symmetric
about the expected value, the posterior covariance (i.e. the estimated
covariance of

Further diagnostic information about the retrieval is succinctly contained in
a unitless

Repeated analysis of

With these relations at hand, the proposed retrieval is a direct application
of Eq. (

Brightness temperature spectra were intentionally used instead of radiance
spectra because removing curvature from the Planck function improves the
linearity of the retrieval

Identifying OCS spectral features is a straightforward process. Figure

Top shows a simulated IASI BBT spectrum from a desert (i.e. low humidity) atmosphere covering
the spectral range used in the linear retrieval. Middle and bottom show Jacobian spectra,
i.e. the change in BBT for a 1 % increase in volume mixing ratio (VMR) for the
gases listed. The CO

The spectral range included in this retrieval is much larger than the OCS
spectral band, which runs from 2040 to 2080

The spectral characteristics of the observation and the applied constraints
determine the vertical sensitivity of the retrieval. The weighting functions,
i.e. the Jacobian values from perturbing each individual vertical layer, for
OCS at the strongest spectral point (2071.25

Weighting functions for IASI simulations of OCS are shown from
vertical layers

Even though OCS is the desired target, the intent of the joint retrieval is
to simultaneously account for all physical parameters that affect the
observed spectrum above the noise level. Mathematically, this is handled with
the cross terms in Eq. (

With this in mind, the state vector is chosen to be

The vertical representations of the temperature and water vapour Jacobians are shown. These represent triangular perturbations as opposed to rectangular (evenly weighted) vertical perturbations.
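The difference between triangular and rectangular perturbations can be illustrated with a short sketch. It assumes that each triangular perturbation equals one at its own retrieval level and falls linearly to zero at the neighbouring levels; the grid values and the function name `triangular_perturbations` are hypothetical and are not taken from the retrieval itself.

```python
import numpy as np

def triangular_perturbations(fine_grid, nodes):
    """Row k is 1 at nodes[k] and falls linearly to 0 at the neighbouring nodes.

    fine_grid and nodes share one vertical coordinate (assumed here to be
    altitude in km); nodes must be strictly increasing.
    """
    nodes = np.asarray(nodes, dtype=float)
    basis = np.empty((nodes.size, np.size(fine_grid)))
    for k in range(nodes.size):
        unit = np.zeros(nodes.size)
        unit[k] = 1.0
        # Linear interpolation of a unit spike gives the triangular shape.
        basis[k] = np.interp(fine_grid, nodes, unit)
    return basis

# Hypothetical grids, for illustration only.
fine = np.linspace(0.0, 20.0, 41)          # 0.5 km spacing
nodes = [0.0, 5.0, 10.0, 20.0]
basis = triangular_perturbations(fine, nodes)
```

A useful property of this choice is that the perturbations sum to one at every altitude, so perturbing all levels together is equivalent to perturbing the whole profile.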

The ratio of CO

As shown in Fig.

Since the statistical distributions of temperature and water vapour vertical
profiles are well known, the resulting estimates can be constrained to
scenarios found on Earth where clearly unphysical profiles are excluded with
a negligible loss of sensitivity. Furthermore, since atmospheric temperature
and water vapour are physically correlated, it is possible to represent this
effect in the prior covariance. Thus, the 80-atmosphere ensemble was
vertically binned down to the bulk layers of the retrieval and was used to
calculate the sample covariance matrix, which includes the cross-state
physical correlation terms. The subsequent correlation matrix is shown in
Fig.

The correlation matrix is shown for the sample covariance of the H

As a caveat, all elements of the state vector, including OCS, are technically
constrained with finite values in the diagonal of the prior covariance. This
is primarily for the purposes of developing a test-bed iterative retrieval
that utilizes the Levenberg–Marquardt method, which will be discussed next.
OCS variability in the prior covariance is assigned to be 200 %. CO and O

Validation of the retrieval framework, as previously defined, is crucial
to developing confidence in the resulting estimates. Without analysing
external data, using an iterative retrieval one can show that

the estimates converge during iteration

the OCS spectral signature is noticeable in the residual spectrum of the converged result when OCS is excluded from the state vector and all other parameters are retrieved

the variability of the converged residual spectrum over many pixels is similar to the expected instrument noise.
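The third check in the list above can be sketched as a per-channel comparison between the scatter of the converged residuals and the instrument noise. This is a minimal illustration on synthetic numbers: the array shapes, the noise level, the contaminated channel, and the 1.5 threshold are all assumptions chosen for the example and are not values from the retrieval.

```python
import numpy as np

def residual_noise_ratio(observed, modelled, noise_std):
    """Ratio of the per-channel residual scatter to the instrument noise.

    observed, modelled: arrays of shape (n_pixels, n_channels)
    noise_std: array of shape (n_channels,)
    """
    residual = np.asarray(observed) - np.asarray(modelled)
    scatter = residual.std(axis=0, ddof=1)   # sample standard deviation
    return scatter / np.asarray(noise_std)

# Synthetic test case (not real IASI data).
rng = np.random.default_rng(0)
n_pixels, n_channels = 600, 50
noise_std = np.full(n_channels, 0.2)
modelled = 280.0 + rng.normal(size=(n_pixels, n_channels))
observed = modelled + rng.normal(scale=noise_std, size=(n_pixels, n_channels))
# Mimic an unmodelled physical parameter that affects channel 10 only.
observed[:, 10] += rng.normal(scale=0.6, size=n_pixels)

ratio = residual_noise_ratio(observed, modelled, noise_std)
flagged = np.flatnonzero(ratio > 1.5)        # hypothetical threshold
```

Channels with a ratio near one are consistent with instrument noise, while channels well above one point to a parameter that the state vector does not fully account for.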

The iterative retrieval was written as a test bed for the faster linear
scheme; therefore, the spectral range, state vector, and prior covariance are the
same as previously defined. This non-linear approach is based on the
Levenberg–Marquardt method as discussed in

OCS signatures can be shown in the converged residual spectrum (IASI minus
RFM) if all other contributing parameters are retrieved. This is done by
removing OCS from the state vector while retrieving the other 10 in its
absence. Figure

The residual spectrum between the IASI
observation and the converged estimate from the RFM without modelling OCS is
shown as the black line. The red line depicts the instrument noise level
specific to the observed surface temperature of

Once all physical parameters that contribute to the signal above the noise
level are accounted for through the joint retrieval, then the standard
deviation of the spectral difference between the observation and the model,
i.e. the residual, should be equal to the instrument noise. If this is not
the case, then any parameters that are not completely accounted for will show
an associated spectral feature in the standard deviation of the residual
spectra. To test this hypothesis, the iterative retrieval was run over 600 pixels
in a

There are three options to pursue with regard to unresolved but
influential H

The first option is undesirable because there is clearly evidence supporting
further treatment of H

Black shows the sample standard
deviation of the residual spectra between the IASI measurements and the
converged model spectra for an ensemble of 600 pixels from the tropical
South Pacific Ocean. Red shows the average instrument noise (

Spectral channels in remote sensing tend to be highly correlated, not only by the gas-specific rotational–vibrational energy transitions, but through other physical effects such as temperature and pressure. In other words, each channel does not normally add independent information and contains a certain amount of redundancy. In theory, adding more channels to the estimate always increases the total information content to varying degrees. In practice, there are spectral channels that contain more information than others such that adding channels of negligible importance does little to improve figures of merit (like DFS and posterior uncertainty), but this increases sensitivity to unaccounted physical parameter errors. One method to improve the robustness of a retrieval by reducing sensitivity to unaccounted parameters is to select a subset of spectral channels that contains the majority of information, while excluding the remaining channels that negligibly contribute.

Channel selection was performed over the 2000–2300

OCS is so weakly constrained that attempting to maximize the DFS is not
appropriate in this instance. In the unconstrained case, the DFS is not
defined for maximum likelihood estimates. However, it is always desirable to
minimize the posterior uncertainty, whether constrained or not. In this case,
just the uncertainty component of OCS is considered:

The selection begins by first finding the best two spectral channels that
minimize

Ranked spectral channels are shown for a mid-latitude atmosphere based on their contribution towards minimizing the posterior uncertainty of OCS. The asymptote from including all 1201 channels is shown as the dotted red line.
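A ranking of this kind can be sketched with a greedy search. The sketch below assumes the standard optimal-estimation form of the posterior covariance, $(K^T S_\epsilon^{-1} K + S_a^{-1})^{-1}$, with a diagonal noise covariance, and it adds one channel at a time, whereas the selection described in the text starts from the best pair of channels. The function names and the random test Jacobian are hypothetical.

```python
import numpy as np

def posterior_variance(K, noise_var, prior_cov, target):
    """Posterior variance of one state element for the channels in K.

    K: Jacobian of shape (n_channels, n_state); noise_var: (n_channels,)
    """
    information = K.T @ (K / noise_var[:, None]) + np.linalg.inv(prior_cov)
    return np.linalg.inv(information)[target, target]

def rank_channels(K, noise_var, prior_cov, target, n_select):
    """Greedily add the channel that most reduces the target variance."""
    selected, history = [], []
    remaining = list(range(K.shape[0]))
    for _ in range(n_select):
        scores = [
            posterior_variance(K[selected + [c]], noise_var[selected + [c]],
                               prior_cov, target)
            for c in remaining
        ]
        best = remaining[int(np.argmin(scores))]
        selected.append(best)
        remaining.remove(best)
        history.append(float(min(scores)))
    return selected, history

# Synthetic Jacobian and covariances, for illustration only.
rng = np.random.default_rng(1)
K = rng.normal(size=(40, 4))
noise_var = np.full(40, 0.04)
prior_cov = np.diag([1.0, 1.0, 1.0, 4.0])
selected, history = rank_channels(K, noise_var, prior_cov, target=3, n_select=10)
all_channels = posterior_variance(K, noise_var, prior_cov, target=3)
```

Here `history` plays the role of the ranked curve and `all_channels` the role of the asymptote: each added channel can only lower the target variance, and the curve never falls below the value obtained with every channel included.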

The resulting selected channels are shown in
Fig.

The top
100 spectral points (red circles) ranked in Fig.

The validity of the linear retrieval is contingent upon the choice of initial atmosphere. The initialization point should be sufficiently close to the observed atmosphere that a single step places the estimate within the uncertainty level of the true state being observed. Failure to do so results in retrieval error due to the non-linearity of the formulated problem. The question is therefore how an initial atmosphere should be selected in order to minimize the non-linearity error. Three possible techniques that do not require rerunning the forward model are analysed for determining the initial atmosphere.

Select the initial atmosphere whose model spectrum minimizes the spectral cost
function in Eq. (

Another method is to estimate what the model spectrum would be after the retrieval, within the linear framework of the problem, and then select the atmosphere that minimizes the projected spectral cost. The retrieved state can be linearly projected back into spectral space to estimate the posterior spectrum,

If

and

It is important to note that

Finally, the third method considered is to train a vector operator to predict the non-linear
error in OCS based upon the spectral difference between the initial model and
measurement spectra. To do this, all possible permutations
(

However, since there are only 80 independent atmospheres considered,
Eq. (

where the truncated least squares solution to

A fourth possible method would be to select an initial atmosphere based on the time of year and proximity to the observed pixel location. However, the RTTOV ensemble is not well suited for this particular selection method since the atmospheres were chosen to maintain statistical properties of a much larger ensemble and are therefore irregularly spaced in location and season. A separate ensemble of atmospheres parsed in regularly spaced latitude and longitude grids at monthly increments would be more appropriate. Therefore, this study excludes this fourth possible selection method.
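The first selection method, followed by the single linear step, can be sketched as below. This is a toy illustration under stated assumptions: the spectral cost is taken to be a noise-weighted sum of squared differences, the forward model is exactly linear, and the gain matrix is a plain pseudo-inverse rather than the constrained gain matrices precalculated for the retrieval. All names and numbers are hypothetical.

```python
import numpy as np

def select_initial_atmosphere(y_obs, model_spectra, noise_var):
    """Index of the ensemble spectrum with the lowest noise-weighted misfit."""
    diff = model_spectra - y_obs
    cost = np.sum(diff**2 / noise_var, axis=1)
    return int(np.argmin(cost)), cost

def linear_step(y_obs, y0, x0, gain):
    """Single linear update from the linearization point (x0, y0)."""
    return x0 + gain @ (y_obs - y0)

# Toy linear forward model y = A x with a five-member ensemble.
rng = np.random.default_rng(2)
A = rng.normal(size=(20, 3))
states = np.arange(15.0).reshape(5, 3)       # ensemble of initial states
spectra = states @ A.T                       # precalculated model spectra
noise_var = np.full(20, 0.04)

truth = states[2] + np.array([0.1, -0.2, 0.05])
y_obs = A @ truth                            # noise-free observation

idx, cost = select_initial_atmosphere(y_obs, spectra, noise_var)
gain = np.linalg.pinv(A)                     # unconstrained least-squares gain
x_hat = linear_step(y_obs, spectra[idx], states[idx], gain)
```

Because the toy model is exactly linear, the single step recovers the true state from any starting member. With a real radiative transfer model the step is only accurate near the linearization point, which is why the choice of initial atmosphere matters.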

The three listed initial atmosphere selection methods are compared using the
RTTOV ensemble in the absence of instrument noise and contaminating
parameters so that the error due solely to non-linearity is assessed. Each
atmosphere of the 80 is used as a test case where the objective is to select
an initial atmosphere from the remaining 79, which minimizes the error in the
estimate while knowing the true OCS model value. In the ensemble, all
atmospheres contain the same OCS profile because of the lack of information
about its distribution and variability. The model OCS profile is

Figure

Histograms of the OCS retrieval error due to
non-linearities are shown for four methods of selecting the initial atmosphere,

One may conclude that the initial atmosphere should be selected based upon
minimizing the linearly projected cost. However, this was attempted with real
data and in practice it became clear that a grossly non-linear starting
point, such as using an initial polar atmosphere for an observation in the
tropics, may occasionally be projected to outperform all other atmospheres.
This is because linear analysis is only valid in the nearly linear to
moderately linear regimes. Therefore, the method of selecting an initial
atmosphere by minimizing the difference in the mean spectra,
Eq. (

The expected non-linearity error is given by the width of the distributions
in Fig.

Lowermost tropospheric pressure is influential in the OCS retrieval not just through the direct effect of pressure broadening the spectral features near the surface, but also because of pressure-dependent water vapour continuum effects in the lower troposphere that overlap with all OCS spectral lines. Surface pressure variations due to geographical altitude must therefore be accounted for in some way. If not, then the Jacobians and initial column amounts will misrepresent the observation, especially over mountain ranges and high plateaus.

To do this, separate atmospheric ensembles of model spectra, gain matrices,
and initial values were created for surface pressure scenarios of 1030, 900,
800, and

Additionally, temperature contrast between the ground surface and lowest
atmospheric layer affects the sensitivity of the OCS estimates, as shown in
Fig.

Therefore, the method employed in this work is to treat IASI observations
over ocean as having a routine thermal contrast of

In an iterative retrieval, high confidence in the estimate is obtained by
verifying that the retrieval converged on a minimum

First, any IASI pixels with an AVHRR cloud fraction of 20 % or greater are excluded from consideration prior to computing the retrieval. The presence of cloud introduces highly non-linear behaviour that must be modelled properly if the OCS estimates are to be trusted. This AVHRR cloud fraction product is not perfect and routinely flags sea ice as cloud. However, the vast majority of the time it provides a robust and accurate estimate of the amount of cloud filling the IASI pixel. Therefore, cloudy scenes are simply avoided in favour of clear sky observations.

Next, viewing angles noticeably affected by sun glint are excluded from the
retrieval by calculating the specular solar reflection angle based upon the
solar and satellite zenith and azimuth angles

Since the retrieval jointly estimates other physical parameters in
conjunction with OCS, there is further opportunity for common-sense filtering
for quality. For example, if the retrieved surface temperature falls outside
of the range between 230 and 340

Finally, the projected spectral cost from Eq. (

All of the IASI-A and B data from 2014 (

The number of pixels per bin passing the quality and cloud free criteria is also shown for reference. Only bins containing three or more observations are shown and any areas with two or fewer observations are considered missing and are coloured grey. Areas that are systematically low in number of observations are either routinely flagged as cloudy or routinely predicted via the projected cost to poorly model the observation. Notice that areas of sea ice towards the poles are consistently absent, which is due to AVHRR cloud flagging. However, persistent glaciers over land contain many more observations and do not experience this false-positive cloud-flagging effect. Alternatively, desert areas during the daytime in local summer are frequently cloud free and marked as such, but they routinely fail the quality check and contain few estimates. This signifies that the model atmospheres in the ensemble fail to closely match summer desert scenarios that are sun illuminated, perhaps because of lower surface emissivity that increases solar and downwelling reflections that are currently not modelled.
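The binning rule described above (a median per latitude-longitude bin, with bins of fewer than three observations treated as missing) can be sketched as follows. The bin width is left as a parameter because the value used for the maps is not assumed here; the function name, the longitude convention of -180 to 180 degrees, and the sample values are hypothetical.

```python
import numpy as np

def binned_median(lat, lon, values, bin_deg, min_count=3):
    """Median and count per lat-lon bin; sparse bins are set to NaN."""
    n_lat = int(round(180.0 / bin_deg))
    n_lon = int(round(360.0 / bin_deg))
    i = np.clip(((np.asarray(lat) + 90.0) // bin_deg).astype(int), 0, n_lat - 1)
    j = np.clip(((np.asarray(lon) + 180.0) // bin_deg).astype(int), 0, n_lon - 1)
    flat = i * n_lon + j
    values = np.asarray(values, dtype=float)
    count = np.bincount(flat, minlength=n_lat * n_lon)
    median = np.full(n_lat * n_lon, np.nan)
    # Only bins with enough observations receive a value.
    for b in np.flatnonzero(count >= min_count):
        median[b] = np.median(values[flat == b])
    return median.reshape(n_lat, n_lon), count.reshape(n_lat, n_lon)

# Five made-up observations: three share one bin, two share another.
lat = [10.2, 10.4, 10.9, -45.5, -45.1]
lon = [20.1, 20.5, 20.7, 100.2, 100.8]
values = [1.0, 2.0, 6.0, 5.0, 7.0]
median, count = binned_median(lat, lon, values, bin_deg=1.0)
```

In this example the bin with three observations receives their median, while the bin with two observations stays missing even though its count is still reported.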

The sample standard deviation of OCS per spatial bin over the
2-month period is shown in the bottom row of
Figs.

Beginning with the oceans, there is a clear correspondence between the daytime and night-time OCS estimates. Prior to filtering based on the solar reflection angle, it was apparent that sun glint was an issue for estimates over water, especially near the equator. However, by excluding observations along the specular path this issue was mitigated such that the day and night estimates resemble each other. This is the expected result because variations in thermal contrast from day to night over water should be fairly small. Therefore, OCS should be equally detectable over water regardless of the time of day.
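A filter along the specular path can be sketched from the viewing geometry alone. The sketch assumes a flat, mirror-like surface and azimuth angles measured from the pixel towards the sun and towards the satellite. The function name and the 20 degree exclusion threshold are hypothetical and are not the values used for the results shown.

```python
import numpy as np

def glint_angle(sol_zen, sat_zen, sol_azi, sat_azi):
    """Angle in degrees between the view direction and the specular
    reflection of the sun off a flat surface (all inputs in degrees)."""
    sz = np.radians(sol_zen)
    vz = np.radians(sat_zen)
    rel = np.radians(np.asarray(sol_azi) - np.asarray(sat_azi))
    cos_g = np.cos(sz) * np.cos(vz) - np.sin(sz) * np.sin(vz) * np.cos(rel)
    return np.degrees(np.arccos(np.clip(cos_g, -1.0, 1.0)))

# Sun and satellite at equal zenith angles on opposite sides: direct glint.
direct = glint_angle(30.0, 30.0, 0.0, 180.0)
# Sun and satellite on the same side: well away from the specular path.
same_side = glint_angle(30.0, 30.0, 0.0, 0.0)

min_glint_deg = 20.0                         # hypothetical exclusion threshold
keep = np.array([direct, same_side]) > min_glint_deg
```

Pixels whose angle falls below the chosen threshold would be excluded before the retrieval is computed.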

January–February
2014: linear estimates of total column OCS median values are shown in the top
row for sun-illuminated morning (left column) and night-time evening (right
column). The results are binned by latitude–longitude widths of

OCS estimates throughout the year show that there is a consistent feature of
elevated OCS in the South Pacific off the coast of South America between

Same as in
Fig.

Northern hemispheric ocean areas appear to have maximum OCS signal between
March and June (local spring) with minimum values as the season approaches
winter. Once again this is consistent with how the incident solar radiation
varies with season for photochemical production. OCS features that
particularly stand out in these areas are the tropical enhancement during May
to June coming off the coast of Baja California
(Fig.

Interestingly, there is an OCS feature over the Pacific Ocean between Japan
and Alaska (Fig.

Same as in
Fig.

Same as in
Fig.

Satellite retrievals over land are subject to a greater number of surface type variations than over the ocean. As a result, there are more variants contributing to the signal that may require a modelled response, such as emissivity, altitude, surface facets, reflectance distribution functions, and snow cover. Therefore, one must analyse spatially sharp OCS gradients over land coinciding with geographical features and overly distinct land–sea boundaries with a certain amount of scepticism.

Recent work by

In this same vein, notice that the high-latitude areas over land near the
Arctic (Fig.

Same as in
Fig.

Additionally, there are several areas over land where there are particularly high OCS signals. Much of the continental United States shows OCS estimates greater than ocean values at similar latitudes. The US OCS signal appears to be at a maximum from March to April and a minimum between July and August, with a slow build up back to March. If these estimates are indicative of the true OCS levels, then the July–August minimum coincides with peak vegetative uptake for regions at this latitude. Sources of OCS in the United States, especially anthropogenic and biomass burning, are currently poorly understood.

Same as in
Fig.

Many regions in the Middle East and the north African Mediterranean coast also show very specific enhancements of OCS estimates. It is possible that there exists a surface emissivity feature in these regions that routinely yields spurious elevated OCS values. However, some of this effect is likely mitigated by the process of calculating the projected cost of the retrieval and removing pixels where the model initial atmospheres are predicted to poorly represent the scene. Therefore, it may also be possible that these signals are real and there are large sources of OCS creating local enhancements. If this signal represents physical OCS amounts, then the source is more likely to be anthropogenic in nature given that the detail closely follows geographical boundaries of human population.

Finally, the areas of high OCS signal over China and the former Soviet
republics east of the Caspian Sea especially stand out in the displayed
estimates. These are also areas of known SO

Total column estimates of OCS from the linear retrieval were also compared to
VMR flask measurements of OCS collected by NOAA

Same as in Fig.

Figure

OCS total column
median estimates (red) from the linear retrieval are compared to NOAA flask
measurements of OCS surface VMR (black) binned in monthly increments
throughout 2014. Retrieval estimates are taken from a

Comparing pressure-specific VMR to total column amount can be tenuous if the true shape of the vertical profile differs greatly from the referenced profile. Furthermore, the flask samples are not exactly coincident with the IASI observations in space and time, thus combining to introduce a certain level of natural error that is difficult to isolate and quantify. However, by analysing on a monthly basis, these effects may be mitigated, where the desired outcome is to show correlation and consistency between the seasonal signals of the two.
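The monthly comparison can be sketched as a Pearson correlation between two 12-element series, skipping months in which either series is missing. The series below are synthetic and only illustrate the calculation; the function name is hypothetical and the numbers are not flask or IASI data.

```python
import numpy as np

def seasonal_correlation(column_monthly, flask_monthly):
    """Pearson correlation of two monthly series, ignoring missing months."""
    a = np.asarray(column_monthly, dtype=float)
    b = np.asarray(flask_monthly, dtype=float)
    ok = np.isfinite(a) & np.isfinite(b)
    return float(np.corrcoef(a[ok], b[ok])[0, 1])

# Synthetic seasonal cycles, for illustration only.
months = np.arange(12)
flask = 500.0 + 20.0 * np.cos(2.0 * np.pi * (months - 4) / 12.0)
column = 2.0 * flask + 3.0                   # same seasonal phase
column[7] = np.nan                           # one month with no valid estimate
anti_phase = -flask                          # opposite seasonal phase

r_in_phase = seasonal_correlation(column, flask)
r_anti_phase = seasonal_correlation(anti_phase, flask)
```

Because the correlation is insensitive to scale and offset, it compares the shape of the seasonal cycles without requiring a conversion between surface VMR and total column amount.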

Of the seven, the Harvard Forest (HFM) site shows the greatest correlation at

Perhaps the most important comparison is the Mauna Loa Observatory (MLO)
because the air is sampled closer to the peak sensitivity of IASI at an
altitude of

Correlations similar to Mauna Loa are found at Trinidad Head (THD), Cape Grim
Observatory (CGO), and Palmer Station in Antarctica (PSA). However, the site
at Mace Head (MHD) shows a lower correlation of only

Finally, the NOAA site located in American Samoa (SMO) actually shows a
negative correlation between flask samples and IASI estimates. This is
entirely due to the first 2 months of the year, January and February, while
the remainder of the year shows a positive correlation. This early year
depletion in the total column estimates can be visualized in
Fig.

A novel linear retrieval method was developed and applied towards making timely estimates of OCS total columns for the entirety of IASI observations from 2014. There are two components that make this retrieval scheme unique in comparison to current linear methods. First, physical parameters that influence the spectral observations over the wave number range used for OCS are directly accounted for by jointly retrieving them along with OCS. This differs from previous methods in that they tend to use an effective measurement covariance that treats the physical parameters not directly retrieved as noise. Second, an initial linearization point is selected from a global ensemble of atmospheres based on minimizing the spectral difference between the IASI and the modelled spectral radiances. This step is intended to make the retrieval more linear, thus reducing the need for iterative steps that rerun the forward model several times per pixel.

Additionally, an iterative retrieval for OCS was used as a test bed to develop and validate the framework of the retrieval, i.e. the state vector, prior constraints, and initial atmosphere selection. Once this was accomplished, an ensemble of IASI observations over the Pacific Ocean was used to quantify the mean spectral residual for the converged estimates and showed that the majority of spectral channels match to within instrument noise, except for the stronger water absorption features. Water vapour channels were then treated as noise by modifying the measurement covariance diagonals accordingly based on the mean spectral residual. Finally, channel selection was performed based on the OCS posterior uncertainty, reducing the number of channels from 1201 to 100, which ultimately made the OCS retrieval almost twice as linear.

The OCS estimates visualized in 2-month increments display many interesting
features consistent with prior knowledge of its sources and sinks. For
example, the daytime total columns show depletions in the OCS signal over
tropical rainforests, which is consistent with the idea that vegetation is
the strongest sink of OCS. The Pacific Ocean displays spatial features of
elevated OCS that vary seasonally and appear to match the prediction made by

To validate the linear retrieval on a monthly basis, these OCS results were
compared to surface VMR samples collected via flask by NOAA stations across
the globe. It was found that five (three Northern Hemisphere and two Southern
Hemisphere) NOAA sites out of seven had seasonal cycle correlation
coefficients greater than 0.7. Further comparisons to aircraft campaigns and
zenith-viewing surface estimates of OCS are desirable. The HIAPER
Pole-to-Pole Observations (HIPPO) study (

In the absence of a large computational cluster, iteratively analysing
forward models of radiative transfer may still be too time consuming to
evaluate IASI data beyond individual and area-specific events. In this case,
one can trade some retrieval accuracy for speed by treating the problem within
the linear framework presented in this paper, speeding up the
computational process by a factor of roughly

Work presented here paid particular attention to OCS as an interesting test case. However, it is important to note that the linear retrieval method presented, using a multi-element state vector to jointly account for other physical parameters and selecting an initialization point from an atmospheric ensemble, can be applied to any trace gas for any nadir-viewing instrument similar to IASI. While the OCS results require further validation, the OCS spatial fields presented are intriguing and may lead to future understanding of its sources and sinks. Furthermore, this method can potentially provide additional insights for minor trace gases that are, as of yet, poorly quantified.

IASI radiance data can be downloaded from numerous sources, including

The authors declare that they have no conflict of interest.

Portions of this work were funded by the United Kingdom's National Centre for Earth Observation and the United States Air Force. The views expressed in this article are those of the author and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the US Government. Additionally, we thank Steve Montzka of NOAA for permission to use the OCS flask data.

Edited by: M. Chipperfield
Reviewed by: two anonymous referees