The representation of solar cycle signals in stratospheric ozone. Part I: A comparison of satellite observations

Changes in incoming solar ultraviolet radiation over the 11-year solar cycle affect stratospheric ozone abundances. It is important to quantify the magnitude, structure and seasonality of the associated solar-ozone response (SOR) to understand the impact of the 11-year solar cycle on climate. Part I of this two-part study uses multiple linear regression analysis to extract the SOR in a number of recently updated satellite datasets covering different periods within the epoch 1970 to 2013. The annual mean SOR in the updated version 7.0 (v7.0) SAGE II number density dataset (1984-2004) is very consistent with that found in the previous v6.2. In contrast, we find a substantial decrease in the magnitude of the SOR in the tropical upper stratosphere in the SAGE II v7.0 mixing ratio dataset compared to v6.2. This difference is shown to be largely attributable to the change in the independent stratospheric temperature dataset used to convert SAGE II ozone number densities to mixing ratios. Since these temperature records contain substantial uncertainties, we suggest that datasets based on SAGE II number densities are currently most reliable for evaluating the SOR. We further analyse three extended ozone datasets that combine SAGE II v7.0 number density data with more recent GOMOS or OSIRIS measurements. The extended SAGE-OSIRIS dataset (1984-2013) shows a smaller and less statistically significant SOR across much of the tropical upper stratosphere compared to the SAGE II data alone. In contrast, the two SAGE-GOMOS datasets (1984-2011) show SORs that compare better with the original SAGE II data and therefore appear to provide a more reliable estimate of the SOR. We also analyse the SOR in the recent SBUVMOD version 8.6 (VN8.6) (1970-2012) and SBUV Merged Cohesive VN8.6 (1978-2012) datasets and compare them to the previous SBUVMOD VN8.0 (1970-2009). Over their full lengths, the three records generally agree in terms of the broad magnitude and structure of the annual mean SOR. The main difference is that SBUVMOD VN8.6 shows a smaller and less significant SOR in the tropical upper stratosphere, and therefore more closely resembles the SAGE II v7.0 mixing ratio data than does the SBUV Merged Cohesive VN8.6, which has a more continuous SOR of ∼2% in this region. The sparse spatial and temporal sampling of limb satellite measurements prohibits the extraction of sub-annual variations in the SOR from SAGE-based datasets. However, the SBUVMOD VN8.6 dataset suggests substantial month-to-month variations in the SOR, particularly in the winter extratropics, which may be important for the proposed high latitude dynamical response to solar variability. Overall, the results highlight substantial uncertainties in the magnitude and structure of the observed SOR from different satellite records. The implications of these uncertainties for understanding and modelling the effects of solar forcing on climate should be explored.

Introduction
Whilst fractional changes in total solar irradiance (TSI) between the maximum and minimum phases of the approximately 11-year solar cycle are known to be small (<0.1%), there is enhanced fractional variability in the ultraviolet (UV) spectral region (>6%) (e.g. Ermolli et al. (2013)). An increase in UV irradiance impacts stratospheric heating rates, and thus temperatures, through two main mechanisms: (1) enhanced absorption of radiation by ozone, and (2) enhanced production of ozone through the photolysis of oxygen at wavelengths less than 242 nm. Consistent with these mechanisms, past studies using observations, reanalysis data and models have identified an increase in annual mean temperature in the upper stratosphere of up to ∼1.5 K between solar maximum and minimum (e.g. Ramaswamy et al. (2001); Mitchell et al. (2015a); Austin et al. (2008)), and an increase in ozone abundances of a few percent (Soukharev and Hood, 2006; Haigh, 1994). These radiatively driven changes modify the meridional temperature gradients in the upper stratosphere, which can lead to a modulation of planetary wave propagation and breaking, and changes in the strength of the stratospheric polar vortex (e.g. Kuroda and Kodera (2002); Matthes et al. (2004, 2006); Gray et al. (2010); Ineson et al. (2011)). Such feedback mechanisms can lead to amplified changes in regional surface climate via stratosphere-troposphere dynamical coupling (e.g. Gray et al. (2010)). Constraining the stratospheric response to solar forcing is therefore important for understanding solar-climate coupling and potential sources of decadal variability in the climate system (e.g. Thiéblemont et al. (2015)).
The solar-ozone response (SOR) has been estimated to make a substantial contribution to variations in stratospheric temperatures over the 11-year solar cycle. Gray et al. (2009) used an estimate of the SOR from SAGE II (Stratospheric Aerosol and Gas Experiment II) version 6.2 (v6.2) satellite ozone mixing ratio data and spectral solar irradiance (SSI) variations from Lean (2000) to show that the contribution of the SOR to temperature changes between the maximum and minimum phases of the 11-year solar cycle is around 60% at the tropical stratopause, 30-40% between 40-50 km, and 70-80% between 20-30 km. Shibata and Kodera (2005) conducted similar calculations using estimates of the SOR from two atmospheric chemical models and found that the SOR accounted for only around 20-25% of the solar-cycle temperature response near the tropical stratopause. Since the two studies used similar SSI data, this difference must arise from the SOR estimated from SAGE II observations used by Gray et al. (2009) being different from that simulated in the atmospheric chemistry models used by Shibata and Kodera (2005). It is therefore important to evaluate the SOR and its uncertainties in different observational datasets to understand the climate response to solar variability and to provide an independent means for evaluating the performance of atmospheric chemistry models (e.g. Austin et al. (2008); see also Part II).
Whilst past studies have quantified the SOR in observations (e.g. Soukharev and Hood (2006); Randel and Wu (2007); Remsberg and Lingenfelser (2010); Remsberg (2014); Bourassa et al. (2014); Lean (2014)), there are differences in the magnitudes and structures between individual satellite records. It is not clear whether these are due to inter-instrument differences in observational periods and/or differences in instrument resolution, sampling or drifts. There are also apparent differences in the structure and magnitude of the SOR between observations and atmospheric chemistry models (e.g. Haigh (1994); Soukharev and Hood (2006); Austin et al. (2008); Dhomse et al. (2011)).
These issues are compounded by current uncertainties in the characteristics of spectral solar irradiance variability (e.g. Ermolli et al. (2013)), which have implications for constraining the magnitude and structure of the SOR because of its dependence on photochemical processes (Haigh et al., 2010; Dhomse et al., 2015; Ball et al., 2016). These factors present an additional challenge for understanding and evaluating the overall climate response to solar variability, particularly since dynamical feedbacks may amplify the effects of an initially small forcing (e.g. Matthes et al. (2006)).
The aim of this two-part study (see also Maycock et al., in prep.) is to evaluate the representation of the SOR and its uncertainties in satellite observations and global models. The present Part I describes the SOR in the latest version 7.0 (v7.0) of the SAGE II dataset and compares it to the former v6.2, which has been used in several solar-climate studies (e.g. Soukharev and Hood (2006); Gray et al. (2009)) and in several ozone databases developed for climate models without interactive chemistry (Cionni et al., 2011; Bodeker et al., 2013). A number of merged satellite ozone datasets, which extend SAGE II using more recent measurements, have also been created and analysed as part of the WCRP/SPARC (World Climate Research Programme/Stratospheric-tropospheric Processes and their Role in Climate) SI2N ozone trends activity (e.g. Tummon et al. (2015)); we analyse the SOR in three of these combined satellite ozone datasets. We also analyse the SOR in two versions of the recently released VN8.6 of the Solar Backscatter Ultraviolet Instrument (SBUV) data and compare these to the former SBUVMOD VN8.0 data.
Part II of the study (Maycock et al., in prep.) describes the SOR in atmospheric chemistry-climate model simulations from the WCRP/SPARC Chemistry-Climate Model Initiative (CCMI) and compares them to a subset of the observational records discussed here that are determined to be most reliable for diagnosing the SOR (see below). Part II also discusses the representation of the SOR in the climate model ozone dataset created for the fifth Coupled Model Intercomparison Project (CMIP5) (Cionni et al., 2011). This leads to a discussion of the representation of the SOR in the ozone dataset being created for CMIP6 model simulations (Hegglin et al., in prep.).
Given the potential application of the results described below for use in climate model simulations, it is prudent to briefly review the typical requirements of an ozone database for models by describing the CMIP5 dataset as a representative example (Cionni et al., 2011) (see also Bodeker et al. (2013)). The CMIP5 ozone database provided monthly mean ozone mixing ratios on a regular latitude/pressure grid at a horizontal resolution of 5° × 5° (lon/lat) on 24 pressure levels covering 1000-1 hPa for the period 1850-2100. Data were provided on the following pressure levels: 1000, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 80, 70, 50, 30, 20, 15, 10, 7, 5, 3, 2, 1.5, 1 hPa. Stratospheric ozone data (at p ≤ 300 hPa) were given as zonal mean values. Therefore, any description of the SOR must fulfil these (or similar) criteria to be viable for use in climate models (i.e. global coverage at monthly mean resolution and with sufficient vertical and horizontal resolution throughout the stratosphere).

Ozone datasets
The satellite ozone datasets examined in this study are summarised in Table 1. A detailed overview of their spatial and temporal sampling characteristics and, where appropriate, their merging procedures is provided by Tummon et al. (2015) and references therein. Their main properties are briefly summarised below. Since our goal is to extract a signal with power on a quasi-decadal timescale, it is desirable to use the longest available timeseries and we therefore analyse all datasets for their full time periods. For the longest record considered, this amounts to approximately three solar cycles.

SAGE II based records
The SAGE II record forms the basis of many long-term ozone datasets (see e.g. Tummon et al. (2015)). As a limb-viewing instrument, SAGE II has fairly sparse spatial and temporal sampling, with a given latitude measured approximately once per month; however, it is recognised as having good long-term stability and a vertical resolution of ∼1 km in the stratosphere, which are characteristics that are likely to be important for analysing the SOR. We use zonal and monthly mean ozone data from October 1984 to August 2005 provided through the WCRP/SPARC Data Initiative (SDI) (Tegtmeier et al., 2013).
SAGE II natively retrieves ozone as number densities on altitude levels; data are post-processed to volume mixing ratios (vmr) on pressure levels using temperatures from a meteorological reanalysis dataset. The SAGE II retrieval algorithm was recently updated as part of the version 7.0 release (Damadeo et al., 2013). The SOR in SAGE II v6.2 data has been discussed in a number of studies: e.g. Randel and Wu (2007); Soukharev and Hood (2006); Gray et al. (2009) for mixing ratios, and Remsberg and Lingenfelser (2010) for number densities. Here we compare the SOR in the latest v7.0 release to the previous v6.2 in units of number densities and mixing ratios. It is important to conduct this comparison for both sets of units because the temperature record used to convert SAGE II to mixing ratios was changed between v6.2 and v7.0 from National Meteorological Center/National Center for Environmental Prediction (NMC/NCEP) to Modern Era-Retrospective Analysis for Research and Applications version 1 (MERRA-1) reanalysis data. The impact of this change on the SOR has not been previously evaluated and is described in Section 4.1.
As a solar occultation instrument, SAGE II profiles can be categorised as a sunrise (SR) or sunset (SS) measurement. There are known variations in the relative numbers of SR/SS retrievals over the SAGE II record. For example, SAGE II obtained profiles in two narrow latitude bands each day, 15 each at sunrise and sunset, but after November 2000 SAGE II measured only one profile per orbit at either SR or SS. These variations in SR/SS sampling have been shown to affect estimates of climatological ozone values due to diurnal cycle effects (Toohey et al., 2013), but could also affect temporal variability in monthly mean ozone values. To account for the possible effects of these sampling issues on the estimation of the SOR, we add an additional term to the multiple linear regression model for SAGE II data that represents the fraction of SR to total (SR+SS) profiles used to generate each monthly mean data point (see Section 3).
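The sampling index described above can be illustrated with a minimal sketch. The function name and the profile counts are hypothetical and are not taken from the SAGE II data files; months without any profiles are returned as missing, which is an assumption about how such months would be handled.

```python
def sunrise_fraction(n_sunrise, n_sunset):
    """Fraction of sunrise profiles among all profiles in one monthly mean.

    Returns None when the month has no profiles, so the month can be
    treated as missing rather than given an arbitrary index value.
    """
    total = n_sunrise + n_sunset
    if total == 0:
        return None
    return n_sunrise / total


# Hypothetical (sunrise, sunset) profile counts for four months.
monthly_counts = [(12, 18), (0, 25), (30, 0), (0, 0)]
sr_index = [sunrise_fraction(sr, ss) for sr, ss in monthly_counts]
```

The resulting index lies between 0 (sunset profiles only) and 1 (sunrise profiles only) and can be supplied to the regression as one additional column.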
The SAGE II mission stopped measuring in 2005. Since then several satellite instruments have continued to measure ozone, and there are now a number of combined datasets that have extended SAGE II to near the present day. These datasets were recently analysed as part of the WCRP/SPARC SI2N activity to evaluate long-term ozone trends (see Tummon et al. (2015) and references therein), including SWOOSH (Stratospheric Water and OzOne Satellite Homogenized) (Davis et al., 2016), GOZCARDS (Froidevaux et al., 2015), SAGE-GOMOS (Global Ozone Monitoring by Occultation of Stars) (Kyrölä et al., 2015; Penckwitt et al., 2015), and SAGE-OSIRIS (Optical Spectrograph and Infrared Imager System) (Bourassa et al., 2014). As mentioned above, SAGE II mixing ratios are produced by conversion of number densities using an independent temperature record. The uncertainties in the SOR that result from using different stratospheric temperature records for this conversion are demonstrated in Section 4.1. This leads us to focus our analysis of the SOR on the extended records that provide ozone as number densities and are therefore less dependent on the conversion issues that accompany the choice of a particular temperature record (see Section 4.2).
Since SWOOSH and GOZCARDS currently only provide ozone mixing ratios we do not analyse them here.
The three extended ozone datasets all include SAGE II v7.0 number densities. Differences in the SOR between the datasets may therefore arise as a result of the more recent measurements used to extend SAGE II and/or from the methods used to merge the different satellite records. Two of the datasets extend SAGE II using GOMOS, which flew on the ENVISAT satellite and covers 2002-2011, but take different approaches for combining the two records. Kyrölä et al. (2015) use GOMOS as a reference and adjust SAGE II sunrise and sunset profiles separately at each latitude and altitude; this dataset will be referred to as SAGE-GOMOS 1. Conversely, Penckwitt et al. (2015) use SAGE II as a reference and adjust GOMOS data using seasonally-varying offsets at each latitude and altitude; this dataset will be referred to as SAGE-GOMOS 2. The third dataset analysed extends SAGE II using OSIRIS data and covers 1984-2013 (Bourassa et al., 2014; Sioris et al., 2014). Latitude and altitude dependent offsets are calculated for the deseasonalised data during the instrument overlap period (January 2002-August 2005), and the OSIRIS data are adjusted to produce a consistent combined SAGE II and OSIRIS timeseries.
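The general idea of an offset-based merge can be sketched for a single latitude and altitude as follows. This is only an illustration under simple assumptions (one constant offset, the reference record used wherever it exists); it is not the procedure of any of the cited merged datasets, and the function names are hypothetical.

```python
def overlap_offset(reference, other):
    """Mean of (reference - other) over months where both records have data."""
    diffs = [r - o for r, o in zip(reference, other)
             if r is not None and o is not None]
    if not diffs:
        raise ValueError("the two records do not overlap")
    return sum(diffs) / len(diffs)


def merge_records(reference, other):
    """Extend `reference` with `other` after shifting `other` by the offset.

    Assumption: where both records exist the reference value is kept.
    """
    offset = overlap_offset(reference, other)
    merged = []
    for r, o in zip(reference, other):
        if r is not None:
            merged.append(r)
        elif o is not None:
            merged.append(o + offset)
        else:
            merged.append(None)
    return merged


# Hypothetical deseasonalised anomalies; None marks a missing month.
reference = [1.0, 2.0, 3.0, None, None]
other = [None, 1.5, 2.5, 4.0, 5.0]
merged = merge_records(reference, other)
```

Choosing which instrument acts as the reference, and whether offsets vary with season, is exactly where the merged datasets described above differ.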

SBUV based records
In addition to SAGE II, the other main long-term internally-calibrated satellite ozone dataset is comprised of data from the Backscatter Ultraviolet Radiometer (BUV) and Solar Backscatter Ultraviolet Radiometer (SBUV) instruments on board Nimbus satellites and the SBUV/2 instruments on various National Oceanic and Atmospheric Administration (NOAA) satellites. Data are available as mixing ratios on pressure levels from January 1970 to near the present day. As nadir-viewing instruments, the BUV/SBUV records have more frequent global coverage than the limb-viewing SAGE II, but their vertical resolution is at least an order of magnitude poorer at pressures greater than ∼15 hPa, rendering it more difficult to resolve detailed ozone structures in the mid and lower stratosphere.
Since the entire BUV/SBUV record is comprised of multiple records from different satellites, inter-instrument biases and drifts must also be accounted for to produce a homogenised record.
We analyse zonal and monthly mean data from the SBUV Merged Ozone Dataset (SBUVMOD) version 8.0 (VN8.0) and the latest release, SBUV VN8.6 (McPeters et al., 2013; Bhartia et al., 2013), thereby complementing previous analyses of the SOR (e.g. Soukharev and Hood (2006)). SBUVMOD VN8.0 covers the period 1970-2009 and was downloaded from http://acd-ext.gsfc.nasa.gov/Data_services/merged/data/sbuv.70-09.za.v8_prof.vmr.rev1.txt. Two versions of the SBUV VN8.6 record have been produced so far: the SBUVMOD VN8.6 dataset from NASA, which covers 1970-2012 (Frith et al., 2014), and the SBUV Merged Cohesive dataset from NOAA, which covers 1978-2012 (Wild and Long, 2015). These are identical to the datasets analysed as part of the SI2N activity (e.g. Tummon et al. (2015)). The two SBUV VN8.6 datasets contain some differences in the data that are included from different instruments within a particular period (see Figure 1 in Tummon et al. (2015)), and in the methods for averaging and merging these data. SBUV Merged Cohesive VN8.6 uses data from a single instrument in any time period; the individual records are then bias-corrected to produce a continuous record (Wild and Long, 2015). In contrast, SBUVMOD VN8.6 is constructed by averaging all available data within a particular time window (Frith et al., 2014). The SBUVMOD datasets extend back to 1970 by including data from the BUV instrument on Nimbus 4 from 1970-1976, whereas the SBUV Merged Cohesive dataset starts from 1978 with the first SBUV instrument on Nimbus 7.

The multiple linear regression model
Following numerous earlier studies (e.g. Frame and Gray (2010); Soukharev and Hood (2006); Mitchell et al. (2015a)), the SOR is diagnosed using multiple linear regression (MLR); this technique enables the signals associated with different forcings within a single timeseries to be separated.
The ozone data are first deseasonalised by removing the long-term monthly mean at each latitude and pressure (or altitude). As in past studies, we then perform an MLR analysis on the timeseries of monthly mean ozone anomalies at each location, O3(t), to diagnose the 11-year solar cycle component (Equation 1), in which r(t) is a residual. The analysis mainly focuses on annual-mean signals, which are calculated by regressing all months as a single timeseries.
The monthly basis functions are: the F10.7cm radio solar flux (http://lasp.colorado.edu/lisird/tss/noaa_radio_flux.html), the CO2 concentration at Mauna Loa [...]

ENSO is the main regressor for which a lagged response in stratospheric ozone might be expected; however, we find that the SOR is not sensitive to lagging the ozone anomalies with respect to the Nino 3.4 index by 0-12 months. We therefore do not include any lags in Equation 1. We have also tested the sensitivity of the diagnosed SOR to the use of a spatially-varying EESC field using output from the UM-UKCA chemistry-climate model REF-C1 CCMI integration. However, this has virtually no effect on the SOR compared to the use of a single EESC timeseries for all locations, and we therefore adopt the latter approach for simplicity.
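The two steps described above, deseasonalising and then regressing the anomalies onto a set of basis functions, can be sketched as follows. This is a minimal ordinary least squares illustration using NumPy, not the authors' code: the function names are hypothetical, the intercept column is an assumption, the regressor columns are placeholders for the basis functions, and no autoregressive noise term is included.

```python
import numpy as np


def deseasonalise(series, months):
    """Subtract the long-term mean of each calendar month (NaNs are ignored)."""
    series = np.asarray(series, dtype=float)
    months = np.asarray(months)
    anomalies = np.full_like(series, np.nan)
    for m in range(1, 13):
        sel = months == m
        if np.any(np.isfinite(series[sel])):
            anomalies[sel] = series[sel] - np.nanmean(series[sel])
    return anomalies


def fit_mlr(anomalies, regressors):
    """Least squares fit of the anomalies onto the regressor columns.

    `regressors` has shape (n_months, k); an intercept column is added here
    (an assumption). Months with a missing anomaly are dropped. Returns the
    coefficients (intercept first) and the residuals of the fitted months.
    """
    y = np.asarray(anomalies, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(regressors, dtype=float)])
    ok = np.isfinite(y)
    coeffs, *_ = np.linalg.lstsq(X[ok], y[ok], rcond=None)
    residuals = y[ok] - X[ok] @ coeffs
    return coeffs, residuals
```

With the solar flux placed in the first regressor column, the second returned coefficient plays the role of the solar regression coefficient at that location.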
We do not include a volcanic term in the regression model, but instead choose to exclude data from the 2-year periods following the two major tropical volcanic eruptions during the analysis epoch: El Chichón (data excluded from April 1982-March 1984) and Mt. Pinatubo (data excluded from June 1991-May 1993). These periods are excluded from the analysis for two reasons. Firstly, some of the datasets analysed implicitly exclude data in these periods for quality control purposes, whereas others do not. For consistency, we therefore exclude these periods for all datasets. Secondly, removing these periods reduces the likelihood of aliasing between volcanic and solar signals, which can be an issue within relatively short climate data records (Chiodo et al., 2014).
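A small sketch of how the two exclusion windows might be applied to a monthly timeseries is given below. The window dates are those stated above; the function names and the use of None for excluded months are illustrative assumptions.

```python
# Exclusion windows as (first_year, first_month, last_year, last_month), inclusive.
VOLCANIC_WINDOWS = [
    (1982, 4, 1984, 3),  # El Chichon
    (1991, 6, 1993, 5),  # Mt. Pinatubo
]


def is_excluded(year, month, windows=VOLCANIC_WINDOWS):
    """True if (year, month) falls inside any exclusion window."""
    for y0, m0, y1, m1 in windows:
        if (y0, m0) <= (year, month) <= (y1, m1):
            return True
    return False


def mask_volcanic(times, values):
    """Replace values in excluded months with None; `times` holds (year, month)."""
    return [None if is_excluded(y, m) else v
            for (y, m), v in zip(times, values)]
```

Applying the same mask to every dataset keeps the analysed months consistent across records, which is the first of the two reasons given above.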
We adopt the widely used F10.7cm solar flux as a proxy for solar activity in the MLR model. This is a more appropriate measure for variations in the UV spectral region, the key driver of the stratospheric ozone response, than other indices such as total solar irradiance (Gray et al., 2010); however, it should be noted that the F10.7cm flux is not a direct measurement of UV variability, but rather is a proxy for variations at these wavelengths. Throughout the manuscript the SOR is expressed as percent ozone change per 130 solar flux units (1 SFU = 10⁻²² W m⁻² Hz⁻¹) to represent the difference between the 11-year solar cycle maximum and minimum.
The 95% confidence intervals on the SORs are estimated by:

$$A \pm t_{\alpha/2,\,n-(k+1)} \sqrt{C_{AA}}$$

where $A$ is the solar regression coefficient in Equation 1, $t_{\alpha/2,\,n-(k+1)}$ is the critical t-value at a significance level, $\alpha$, of 0.05 with $n-(k+1)$ degrees of freedom, $n$ is the number of data points in the regression, $k$ is the number of regressors, and $C_{AA}$ is the variance of the estimated solar regression coefficient $A$.
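The interval above can be evaluated from an ordinary least squares fit as in the following sketch. It assumes uncorrelated residuals (no autoregressive correction), takes the critical t-value as an input rather than computing it, and uses a hypothetical function name.

```python
import numpy as np


def coefficient_confidence_interval(X, y, index, t_crit):
    """Return (coefficient, half-width) for the regression coefficient `index`.

    X is the (n, k+1) design matrix including the intercept column, y holds
    the n anomalies, and t_crit is the critical t-value for n-(k+1) degrees
    of freedom, supplied by the caller (e.g. from statistical tables).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    sigma2 = resid @ resid / (n - p)        # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficients
    return coeffs[index], t_crit * np.sqrt(cov[index, index])
```

The diagonal element `cov[index, index]` corresponds to the variance of the estimated coefficient, so for the solar term it is the quantity written as C_AA in the text.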
As mentioned in Section 2.1, the SAGE II record is affected by irregular SR and SS sampling as a function of time. This could introduce spurious variability in the monthly mean ozone values, particularly in the upper stratosphere, as a result of the diurnal cycle in ozone. However, many previous regression studies of SAGE II data have not accounted for the non-stationarity in SR/SS sampling (e.g. Randel and Wu (2007)). Here, we account for this by including an additional term in Equation 1 that quantifies the ratio between the number of SR and the total (SR+SS) number of profiles used to produce each monthly mean SAGE II data point; this index can take values between 0 and 1. An example of this index for the SAGE II v7.0 dataset at 1 hPa averaged over the tropics [...]

One important issue for MLR analysis is the handling of possible autocorrelation in the regression residuals and its effects on the estimation of statistical uncertainties. A Durbin-Watson test does not reveal significant autocorrelation in the regression residuals at most locations; however, this is likely to be because there is a considerable fraction of missing data points in many of the datasets analysed. In the analysis of chemistry-climate model simulations in Part II of this study, for which there is implicitly complete spatial and temporal sampling, a Durbin-Watson test reveals significant serial correlation in the regression residuals in many locations for lags of one and two months, particularly in the lower stratosphere and mesosphere. This autocorrelation can lead to spurious overestimation of the statistical significance of the regression coefficients and we therefore include an autoregressive term in the MLR model. Given the significant serial correlation of the residuals in the chemistry-climate models at up to two months lag in some regions, a second order autoregressive noise process (AR2) is used, which assumes the residuals r(t) have the form:

$$r(t) = a\,r(t-1) + b\,r(t-2) + w(t)$$

where a and b are constants and w(t) is a white noise process; this is the same approach employed in the recent SPARC SI2N analysis of ozone trends (Tummon et al., 2015; Harris et al., 2015). The inclusion of this term has a very minor effect on the results for the observational datasets in Part I, but has a greater effect for the model results in Part II. We therefore include it in the analysis here for consistency between both parts of the study.
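To make the two diagnostics concrete, the sketch below computes the Durbin-Watson statistic and a least squares estimate of the AR2 constants from a residual series. It assumes a gap-free monthly series and is not necessarily how the AR2 term is estimated within the MLR in this study; the function names are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 for uncorrelated residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(r * r for r in residuals)
    return num / den


def fit_ar2(residuals):
    """Least squares estimate of a, b in r(t) = a r(t-1) + b r(t-2) + w(t)."""
    r = residuals
    steps = range(2, len(r))
    s11 = sum(r[t - 1] * r[t - 1] for t in steps)
    s22 = sum(r[t - 2] * r[t - 2] for t in steps)
    s12 = sum(r[t - 1] * r[t - 2] for t in steps)
    s1y = sum(r[t] * r[t - 1] for t in steps)
    s2y = sum(r[t] * r[t - 2] for t in steps)
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det
    return a, b
```

Values of the statistic well below 2 indicate positive serial correlation, in which case confidence intervals that ignore the autoregressive term would tend to be too narrow.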
Results
The two SAGE II ozone mixing ratio datasets (black and red lines) are also in reasonable agreement for long-term changes in the mid-stratosphere (10 and 30 hPa). However, in the upper stratosphere (1 and 3 hPa) there are substantial differences in both short and long-term variations. For example, SAGE II v6.2 (black line) shows persistent negative anomalies in the early part of the record which are not evident in v7.0 (red line). These coincide with the minimum of solar cycle 21 from 1985 to 1988. Furthermore, in the latter part of the record, v6.2 shows relatively large amplitude fluctuations with mean positive anomalies from 2002 to 2004, which coincide with the peak and subsequent declining phase of solar cycle 23. Thus, there are differences in the evolution of ozone between the two SAGE II mixing ratio datasets, particularly in the upper stratosphere.

The SOR in SAGE II datasets
Overall, the two versions of SAGE II number densities are in closer agreement than the mixing ratio data.
Figures 4(a [...] to the southern subtropics. The uncertainties in the lower stratosphere between 22-28 km are smaller in magnitude, but this is partly because the SOR is also smaller here (note the confidence intervals are expressed as percent ozone to be directly comparable to Figure 4). Overall, the 95% confidence intervals are around 30-50% of the magnitude of the 'best estimate' SOR in Figure 4, indicating that there are considerable uncertainties in the SOR in the SAGE II datasets. This has implications for understanding the contribution of the SOR to the climate response to the solar cycle.

The structures of the SOR between 20 and ∼7 hPa are also similar, with subtropical maxima of 1-2% and a distinct equatorial minimum. However, the SORs in the upper stratosphere are markedly different between v6.2 and v7.0. Polewards of ±20° the structures of the SORs are similar in both datasets, but the magnitude is ∼1% larger in v6.2. In the tropics, the v6.2 data show a large peak in the SOR in the uppermost stratosphere of up to 5%, whereas the v7.0 data show a smaller SOR of 1% in this region.
The confidence intervals for the SAGE II mixing ratio SORs in Figures 5(c) and 5(d) are generally similar to those for number densities, except that the uncertainties are considerably larger in the tropical upper stratosphere in both datasets, particularly in SAGE II v6.2. The relatively large uncertainties in the 'best estimate' of the SOR would feed through to similar uncertainties in the contribution of the SOR to the atmospheric response to the 11-year solar cycle (Gray et al., 2009; Shibata and Kodera, 2005). It is therefore important to understand the causes of the differences in SOR between the SAGE II v6.2 and v7.0 datasets, since they present a limitation for understanding and simulating the climate response to solar forcing (e.g. Ermolli et al. (2013); Mitchell et al. (2015b)). This is explored in the next section.

Differences in NMC/NCEP and MERRA-1 stratospheric temperature records
Since the two versions of SAGE II show comparable SORs for number densities, the differences between Figures 4(c) and 4(d) must be related to the conversion of SAGE II data to ozone mixing ratios. As described in Section 2, SAGE II v6.2 employed NMC/NCEP temperature data for this conversion, but this was changed to MERRA-1 for v7.0 (see Damadeo et al. (2013) for details). The differences in the SOR in the upper stratosphere must therefore be related to the use of different temperature records in the conversion. It is known that the evolution of stratospheric temperatures in some reanalyses shows unphysical variability and trends (Mitchell et al., 2015a), and these have been corrected for in some solar-climate studies (e.g. Frame and Gray (2010); Hood et al. (2015)). However, the effect of temperatures on the SOR in SAGE II data has not been considered previously. Indeed, spurious variations in stratospheric temperatures in reanalysis datasets, which are introduced through changes in the observing system over time, could mask or enhance the signal of the 11-year solar cycle in SAGE II ozone mixing ratios.
Figure 6 shows timeseries of annual and tropical mean temperature anomalies at select stratospheric levels (1, 2, 5, 10, 30 hPa) for the NMC/NCEP and MERRA-1 datasets. The NMC/NCEP temperatures are those provided with the published SAGE II data files and cover 1985-2003. MERRA-1 data were downloaded for 1979-2013 from the NASA GSFC website. At 30 hPa, the evolution of the two temperature records is nearly identical during the period of overlap, with a long-term cooling trend of ∼0.6 K decade⁻¹ that is strongly connected to an apparent step-wise cooling of ∼2 K between 1992 and 1994. However, at pressures less than 30 hPa there are substantial differences between the records. The NMC/NCEP data show exceptional behaviour between 2000 and 2003. At 1 hPa, there is a warming of more than 3 K over this short period, which is coincident with a warming of ∼1 K at 2 hPa. In contrast, at 5 and 10 hPa there is a cooling of more than 4 and 2 K, respectively, over this period. The magnitude and vertical structure of these changes in the NMC/NCEP record are difficult to attribute to any physical process, particularly when compared to the variations found in the remainder of the record. Some of these issues may be related to the method used to construct the NMC/NCEP temperature record itself. NCEP reanalysis data were only available for pressures greater than 10 hPa, requiring the addition of operational analyses to extend the data to the stratopause. Data from an atmospheric model were used to further extend the temperature data to the mesosphere, but these levels are not considered here (see e.g. [...]).

[...] and is also larger in amplitude than typical solar signals in temperature at this level (Mitchell et al., 2015a). However, the sign is at least consistent with the expected tendency of upper stratospheric temperatures during the declining phase of the solar cycle.
A valid question is which representation of past stratospheric temperatures is likely to be most realistic? Mitchell et al. (2015a) compared MERRA-1 to Stratospheric Sounding Unit (SSU) satellite data and found considerable differences in upper stratospheric temperature variability between the two records. However, the NMC/NCEP data show a long-term warming trend in the upper stratosphere, which is opposite to the cooling expected from increasing atmospheric CO2 and declining ozone abundances over this period. Both records therefore appear to exhibit differences compared to observed stratospheric temperature changes.
The evolution of atmospheric temperatures will affect the geometric altitude of a given pressure surface, as well as the conversion from number density to mixing ratio. It is well known that cooling will lower the altitude of pressure surfaces, a so-called 'atmospheric shrinking' effect. Therefore the presence of cooling near the stratopause in MERRA-1 would tend to lead to a greater atmospheric shrinking than for the NMC/NCEP temperatures. Furthermore, the conversion from number density to mixing ratio is proportional to temperature, so a positive correlation between number density and temperature over the solar cycle would tend to increase the magnitude of the SOR on a given pressure surface.
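The 'atmospheric shrinking' effect can be illustrated with the hypsometric equation, which relates the geometric thickness of a layer between two pressure surfaces to its mean temperature. The following is a minimal sketch only: the function name, the use of the dry-air gas constant and the assumption of constant gravity are illustrative choices and are not taken from the SAGE II processing.

```python
import math

R_DRY = 287.05  # J kg^-1 K^-1, specific gas constant for dry air (assumed adequate here)
G0 = 9.80665    # m s^-2, standard gravity (assumed constant with height)


def layer_thickness(p_bottom_hpa, p_top_hpa, mean_temp_k):
    """Thickness (m) of the layer between two pressure levels (hypsometric equation)."""
    return (R_DRY * mean_temp_k / G0) * math.log(p_bottom_hpa / p_top_hpa)


def shrinking_per_kelvin(p_bottom_hpa, p_top_hpa):
    """Reduction in layer thickness (m) for a 1 K cooling of the layer mean temperature."""
    return (R_DRY / G0) * math.log(p_bottom_hpa / p_top_hpa)
```

Because the thickness is linear in the layer mean temperature, a cooling of the 10-1 hPa layer lowers the 1 hPa surface relative to the 10 hPa surface by a fixed amount per kelvin, independent of the base temperature.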

Dependence of SOR in SAGE II mixing ratios on temperature record
To test the impact of the differences between NMC/NCEP and MERRA-1 temperatures on the SOR in SAGE II, we perform our own conversion of the SAGE II v6.2 data from number densities to mixing ratios. Each monthly and zonal mean ozone profile is first converted to number densities on pressure levels using the hydrostatic equation, and then to mixing ratios on pressure levels using the ideal gas law. The MLR in Equation 1 is then applied to the converted ozone mixing ratios to derive a SOR that can be compared to the published SAGE II mixing ratio datasets discussed above and shown in Figure 4.
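The second step of this conversion (number density to mixing ratio via the ideal gas law) can be sketched as follows. This is an illustrative example under stated assumptions, not the code used in this study: the function name and unit conventions (ozone in molecules cm⁻³, pressure in hPa) are hypothetical, and the first step (placing the profile on pressure levels with the hydrostatic equation) is not shown.

```python
K_B = 1.380649e-23  # J K^-1, Boltzmann constant


def number_density_to_vmr(n_o3_cm3, temp_k, pressure_hpa):
    """Convert ozone number density (molecules cm^-3) to volume mixing ratio (mol/mol).

    Uses the ideal gas law for the air number density, n_air = p / (k_B * T),
    so the mixing ratio at fixed pressure is proportional to temperature.
    """
    n_o3_m3 = n_o3_cm3 * 1.0e6          # cm^-3 -> m^-3
    pressure_pa = pressure_hpa * 100.0  # hPa -> Pa
    n_air_m3 = pressure_pa / (K_B * temp_k)
    return n_o3_m3 / n_air_m3
```

At a fixed pressure level and fixed number density, a fractional change in temperature maps one-to-one onto a fractional change in mixing ratio, which is why trends or solar cycle signals in the temperature record can project onto the diagnosed SOR.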
As a first test, we convert SAGE II v6.2 number densities to mixing ratios using the full timeseries of temperatures from NMC/NCEP and MERRA-1 in turn. The SORs diagnosed from these 'post-hoc' converted datasets are shown in Figures 8(a) and 8(b) for NMC/NCEP and MERRA-1, respectively, with the difference between them shown in Figure 8(c). These can be compared to Figures 4(c)-4(e). We stress that differences in the SORs are to be expected, since in the published SAGE II datasets each ozone profile is converted separately before averaging is performed, whereas here we have converted the monthly, zonally and latitudinally averaged ozone number density profiles.
The SOR in the post-hoc converted data using NMC/NCEP temperatures (Figure 8

In conclusion, the SORs in SAGE II v6.2 and v7.0 are much more consistent in terms of number densities on altitude surfaces than they are for mixing ratios on pressure surfaces. The differences in SORs in the latter occur particularly in the upper stratosphere, and these have been shown to be sensitive to the details of the temperature records used for conversion. The long-term warming trend in the upper stratosphere in NMC/NCEP data is at odds with the understanding of recent changes in stratospheric composition and its impact on temperatures (Randel et al., 2009); however, the peak of the solar cycle signal in stratospheric temperatures in MERRA-1 is at lower altitude than predicted from theory and models. Recent analysis suggests that the relationship between ozone and temperature in the upper stratosphere that is anticipated from photochemical theory is more realistic for the SAGE II v7.0 mixing ratio data than for v6.2 (Dhomse et al., 2015). Nevertheless, there remain questions around which of the SAGE II mixing ratio datasets is likely to be most credible for diagnosing the SOR. These results raise issues for the representation of the SOR in the CMIP5 ozone database, which was largely based on SAGE II v6.2 mixing ratios (Cionni et al. (2011); see also Maycock et al., in prep.).

The SOR in extended SAGE II datasets
Given the uncertainties in the SOR for the SAGE II mixing ratio datasets discussed above, we focus our analysis of the extended SAGE II records on the three SI2N datasets that are currently available as number densities (see Section 2.1): SAGE-GOMOS 1, SAGE-GOMOS 2, and SAGE-OSIRIS.
Extending SAGE II using these more recent measurements increases the number of data points included in the MLR model by almost a factor of 2 in the tropics and by ∼50% in the subtropics (see Supplementary Material Figures S1 and S2). Figure 9 shows timeseries of monthly tropical percent ozone anomalies at select altitudes for the three SI2N datasets. The datasets do not agree perfectly over the SAGE II era (1984-2004) because the anomalies are defined relative to the entire timeseries, but overall they show similar behaviour to SAGE II v7.0 number densities (green line) in Figures 10(a-c) show the SORs in the three extended SAGE II datasets and Figures 10(d-f 2011)), differences in the 'best estimate' of the SOR between the datasets remain important to characterise. The differences in SOR between SAGE-GOMOS 1 and 2 must arise from differences in the data merging procedures, which are summarised by Tummon et al. (2015), and are described in detail by Kyrölä et al. (2015) and Penckwitt et al. (2015). Analysis of the SOR in the two SAGE-GOMOS datasets over the SAGE II period alone (1984-2004) reveals similar differences in magnitude and structure (not shown), which suggests that the use of SAGE II or GOMOS as a reference to which the other record is adjusted is a key factor for the differences in SOR. The SOR in the SAGE-OSIRIS dataset (Figure 10(c)) shows significant positive values in the subtropics between ∼30-40 km. This is consistent with the results of Bourassa et al. (2014), who conducted a similar MLR analysis to assess long-term ozone trends in SAGE-OSIRIS (see also Tummon et al. (2015)). However, the SOR is smaller and less significant in the tropical upper stratosphere and northern extratropics as compared to the two SAGE-GOMOS datasets and the SAGE II v7.0 data. Hubert et al. (2015) identified a significant positive drift of 5-8% decade⁻¹ in OSIRIS data above 35 km compared to ozonesondes and lidar measurements, which may contribute to the differences in SOR in the upper stratosphere.
Although there are broad similarities in the SOR between the three extended SAGE II datasets, there are also some differences. This is despite the fact that all of the datasets use SAGE II v7.0 number densities as a basis. There is therefore a trade-off between generating the longest climate data record possible, which is desirable for analysing quasi-decadal signals, and the introduction of additional sources of uncertainty from combining multiple satellite records with different sampling properties and drifts. There appear to be variations in ozone in the OSIRIS record that reduce the magnitude of the SOR in the extended SAGE-OSIRIS record compared to the SAGE II period alone. When the SAGE-GOMOS datasets are analysed over the SAGE II period (1984-2004), SAGE-GOMOS 1 shows the greatest resemblance to the original SAGE II v7.0 data in Figure 4(b) (not shown) and we therefore conclude that this record is likely the most reliable estimate of the SOR from the datasets considered.

The SOR in SBUV records
Figure 11 shows timeseries of monthly percent ozone anomalies at select stratospheric levels (as in Figure 3) for the SBUVMOD VN8.0 (black line), SBUVMOD VN8.6 (red line), and SBUV Merged Cohesive VN8.6 (blue line) datasets. At 1 hPa, the ozone anomalies in the different datasets are in good agreement between 1979 and 1994. After 1994, the main differences are found between the SBUVMOD VN8.0 and the two SBUV VN8.6 datasets, the latter being largely consistent with one another. In particular, SBUVMOD VN8.0 shows a larger positive trend in ozone from the mid-1990s to the mid-2000s than the SBUV VN8.6 records; this partly coincides with the ascending phase of solar cycle 23. At 3 hPa, a comparison of the three SBUV records reveals somewhat different behaviour. Here, the SBUVMOD VN8.0 and SBUV Merged Cohesive VN8.6 datasets show more similar ozone variations, and instead SBUVMOD VN8.6 is an outlier, exhibiting a larger decline in ozone of ∼7-8% over 1979-2012 compared to the other two records. At 5 hPa, the three SBUV datasets generally show similar temporal variations in ozone in the early and latter parts of the records, with some differences in offsets linked to different behaviours in the late 1990s and early 2000s when data come from the NOAA-11, 14, 16 and 17 satellites. At 30 hPa, the three SBUV records are largely consistent with one another in their short and long-term variations, with some exceptions during the 1990s when the data come mainly from the NOAA-11 and NOAA-14 satellites (see e.g. Tummon et al. (2015)).
Figures 12(a-c) show the annual mean SORs in the (a) SBUVMOD VN8.0, (b) SBUVMOD VN8.6, and (c) SBUV Merged Cohesive VN8.6 datasets. Figures 12(d-f) show the associated 95% confidence intervals in terms of percent ozone. All three SBUV records show a significant positive SOR in some parts of the upper stratosphere of up to 2-3%. The SOR in the tropical upper stratosphere is smaller and not highly statistically significant in SBUVMOD VN8.6, which is in contrast to the two other records and somewhat resembles the SOR in SAGE II v7.0 mixing ratios (Figure 4(d)).
The modifications to the data processing algorithm between SBUVMOD VN8.0 and SBUVMOD VN8.6 are documented by Bhartia et al. (2013); these include the use of new ozone absorption cross-sections, a new a priori ozone climatology, and a new cloud-height climatology. In addition, changes were made to the inter-instrument calibration, which is now achieved at the radiance level during periods of overlap between the SBUV instruments (DeLand et al., 2012; Bhartia et al., 2013). It seems plausible that calibration changes could affect the diagnosis of quasi-decadal variability in ozone, and it is possible that the new processing procedure has smoothed out part of the SOR in the upper stratosphere in SBUVMOD VN8.6. Note that the difference in SOR in the tropical upper stratosphere between the two SBUV VN8.6 records remains when SBUVMOD VN8.6 is analysed over the shorter 1978-2012 period (not shown), so this does not result from the inclusion of the early BUV measurements in SBUVMOD VN8.6.
The SORs in the three SBUV records show further differences between 10-50 hPa, with SBUVMOD VN8.6 showing a larger and more significant SOR, particularly in the northern extratropics, while SBUV Merged Cohesive VN8.6 shows a weaker SOR. However, we note that the poor vertical resolution (∼10 km) of the SBUV instruments at pressures greater than ∼15 hPa makes it challenging to resolve features in the mid and lower stratosphere. Note that the confidence intervals for all the SBUV records are smaller than those for SAGE II based records (see Figure 5 and Figures 10(d-f)).
This is likely to be because the number of data points included in the MLR analysis is around 2-3 times higher for the SBUV datasets than for the SAGE records (see Supplementary Material Figures S1 and S3).
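The dependence of the confidence interval on the number of data points can be illustrated with an ordinary least-squares fit. The sketch below is a simplified stand-in for the MLR used in this study: it assumes uncorrelated residuals (no autocorrelation term), a normal approximation for the 95% interval, and a synthetic sinusoidal 'solar' regressor with an 11-year period. All function names and parameter values are hypothetical.

```python
import numpy as np


def ols_with_ci(y, X, z=1.96):
    """Least-squares coefficients and approximate 95% half-widths (z * standard error)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]           # residual degrees of freedom
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # coefficient covariance matrix
    return beta, z * np.sqrt(np.diag(cov))


def synthetic_solar_half_width(n_months, seed=0):
    """Half-width of the CI on the solar coefficient for a synthetic monthly record."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_months)
    solar = np.sin(2.0 * np.pi * t / 132.0)  # idealised 11-year cycle
    X = np.column_stack([np.ones(n_months), solar])
    y = 2.0 * solar + rng.normal(0.0, 3.0, n_months)
    _, half_width = ols_with_ci(y, X)
    return half_width[1]
```

For the same noise level, tripling the record length in this idealised setting narrows the interval on the solar coefficient by roughly a factor of √3, which is qualitatively consistent with the smaller confidence intervals found for the more densely sampled SBUV records.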
It is desirable for the purposes of e.g. chemistry-climate model evaluation to determine which SBUV dataset might be most reliable for estimating the annual mean SOR. Lean (2014) analysed total column ozone measurements from SBUVMOD VN8.0 and SBUVMOD VN8.6 and found a smaller SOR in SBUVMOD VN8.6 near-global column ozone, which appeared to be related to instrument effects around the 1996 time frame. However, Hood (1997) analysed the SOR in total column ozone data and found that most of the signal is associated with ozone changes in the lower stratosphere that are linked to dynamical processes. Column ozone measurements are therefore unlikely to be particularly helpful for constraining the SOR in the upper stratosphere, where differences are found amongst many of the datasets analysed here and where the SOR is strongly determined by photochemical processes. Tummon et al. (2015) analysed vertical profiles of long-term ozone trends in satellite datasets and found that SBUV Merged Cohesive VN8.6 showed negligible ozone trends at 2 hPa over 1984-1997, whereas almost all other datasets analysed, including SBUVMOD VN8.6, showed a significant decline of several percent per decade over this period. Instead, SBUV Merged Cohesive VN8.6 showed larger negative ozone trends than the other datasets between 5-10 hPa. Wild and Long (2015) and Tummon et al. (2015) explain how the adjustments used to combine data from the ascending node of NOAA-11 with NOAA-9 and NOAA-14 in SBUV Merged Cohesive VN8.6 were determined from the overlap of the descending node of NOAA-11 with NOAA-16 because of known issues with the quality of data from NOAA-9 and NOAA-14 (Kramarova et al., 2013). Since the NOAA-9 and NOAA-14 data coincide with the end of the trend analysis period used by Tummon et al. (2015), this could have had a particularly pronounced impact on their linear trend calculations, but may not be as important for diagnosing the SOR.
From the timeseries of 1 hPa ozone anomalies shown in Figure 11, it would appear that differences between the two SBUV VN8.6 datasets in the early 2000s may be more important for determining the differences in SOR in the tropical upper stratosphere. During this period, which coincides with the maximum of solar cycle 23, SBUVMOD VN8.6 shows persistently more negative ozone anomalies than SBUV Merged Cohesive VN8.6. Further analysis of the SOR for the period up to the year 2000 (not shown) does produce a slightly larger and more significant SOR in the tropical upper stratosphere in SBUVMOD VN8.6, but the magnitude is still ∼1% smaller than in SBUV Merged Cohesive VN8.6, indicating that the post-2000 period alone does not explain all differences between Figures 12(b) and 12(c). Based on the above factors, it is difficult to assert which of the SBUV VN8.6 datasets is likely to be most reliable for estimating the SOR. However, in practice the differences between the SORs in the tropical upper stratosphere in the SBUV records are small compared to the associated statistical uncertainties (Figures 12(d-f)) and small compared to the differences in SOR between the two SAGE II mixing ratio datasets in this region. We therefore conclude that using the longest climate data record is most favourable for diagnosing the SOR, particularly on seasonal timescales (see Section 4.4), and in this case that is SBUVMOD VN8.6.

Seasonality in the solar-ozone response
The analysis thus far has described the annual mean SOR in satellite ozone datasets. However, the SOR is expected to exhibit a seasonal dependence; for example, in regions close to photochemical steady-state the annual cycle in solar zenith angle would be expected to produce a larger SOR in the summer hemisphere (Haigh, 1994). Furthermore, given the hypothesis that solar variability modifies the strength of the stratospheric polar vortex (Kuroda and Kodera, 2002), there may also be seasonal signatures in the SOR arising from dynamical processes, particularly in the winter hemispheres. Seasonal variations in the SOR could potentially influence the overall climate response to solar forcing through coupling to radiation (e.g. Hood et al. (2015)), and it is therefore important to characterise these in observations and chemistry-climate models.
Constraining the SOR on seasonal timescales requires high spatial and temporal data coverage; this is to ensure that any seasonal component of the signal can be resolved, but also to increase the number of degrees of freedom (i.e. the number of data points) available for the regression. Such coverage is not adequately provided by limb-viewing instruments, such as SAGE II, which have relatively sparse and infrequent sampling. The coverage is considerably better for nadir-viewing instruments like SBUV; however, as described above, their vertical resolution is much poorer in the middle and lower stratosphere. There is therefore a trade-off between the information that can be usefully extracted from different data sources.
Given the denser sampling of SBUV compared to SAGE II, we focus here on the SBUVMOD VN8.6 dataset to examine the seasonality of the SOR. Figure 13 shows the monthly SOR in SBUVMOD VN8.6 for the period 1970-2012. These values are calculated by applying the MLR model to timeseries for individual months, and therefore no autocorrelation term has been included, since separate months are approximately uncorrelated from year-to-year. We note that the detailed magnitudes and structure of the monthly SORs are more sensitive to the choice of analysis epoch than for the annual mean SOR (not shown), but the broad features are generally consistent. The key point to take from Figure 13 is that there are substantially enhanced meridional and vertical gradients in the monthly SORs as compared to the annual mean SOR for SBUVMOD VN8.6 in Figure 12(b). This is similar to the conclusion reached by Hood et al. (2015).
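The idea of applying a regression separately to each calendar month can be sketched as below. This is an assumption-laden illustration rather than the MLR of Equation 1: the function name and arguments are hypothetical, the solar regressor is taken to be F10.7 in solar flux units (SFU) rather than a standardised index, any additional regressors are passed in as a generic array, and the response is scaled to a 130 SFU change to match the units used in the figures.

```python
import numpy as np


def monthly_solar_response(ozone_pct, f107_sfu, other_terms, start_month=1):
    """Solar regression coefficient (% per 130 SFU) for each calendar month.

    ozone_pct   : 1-D monthly percent ozone anomalies (NaN where missing)
    f107_sfu    : 1-D monthly F10.7 flux in SFU, same length
    other_terms : array (n_months,) or (n_months, k) of further regressors
    start_month : calendar month (1-12) of the first sample
    """
    ozone_pct = np.asarray(ozone_pct, dtype=float)
    f107 = np.asarray(f107_sfu, dtype=float)
    other = np.asarray(other_terms, dtype=float).reshape(len(ozone_pct), -1)
    months = (np.arange(len(ozone_pct)) + start_month - 1) % 12 + 1

    response = np.full(12, np.nan)
    for m in range(1, 13):
        sel = (months == m) & np.isfinite(ozone_pct)
        # Design matrix: intercept, solar term, then the remaining regressors.
        X = np.column_stack([np.ones(sel.sum()), f107[sel], other[sel]])
        if sel.sum() <= X.shape[1]:
            continue  # too few samples to fit this month
        beta, *_ = np.linalg.lstsq(X, ozone_pct[sel], rcond=None)
        response[m - 1] = 130.0 * beta[1]
    return response
```

Each monthly fit uses only one sample per year, so a 43-year record provides at most 43 points per fit; this is one reason why the monthly responses are expected to be noisier and more sensitive to the analysis epoch than the annual mean.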
Although many of the localised variations in the SOR are driven by dynamical processes, it is also possible that they could feed back onto the circulation through the radiative impacts of ozone on stratospheric heating rates and temperatures. Hood et al. (2015) concluded that the three chemistry-climate models from CMIP5 that simulate strong gradients in ozone in the winter upper stratosphere, which most closely resemble observations, tend to have high latitude dynamical responses that are most similar to reanalysis data. Seasonal variations in the SOR may therefore play a role in the ability of a model to simulate the climate response to solar variability. However, given the tight coupling between ozone and dynamics, attribution of the importance of such radiative feedbacks is particularly challenging. To our knowledge, the importance of this two-way coupling for the climate response to solar variability has not been explicitly tested. This is important to clarify because it is not known whether it is sufficient to simply prescribe a seasonally-varying SOR, or whether a fully interactive chemistry-climate model is required to capture the coupling and feedbacks between composition, radiation and dynamics over the solar cycle. The representation of the SOR in global climate models is discussed in more detail in Part II of this study (Maycock et al., in prep.).

Conclusions
The solar-ozone response (SOR) forms an important part of the climate response to 11-year solar cycle variability through its impact on stratospheric temperatures (e.g. Shibata and Kodera (2005); Gray et al. (2009)). This paper forms the first of a two-part study that aims to quantify the SOR in current satellite observations and chemistry-climate models. Part I has focused on comparing the SOR in recently updated and/or extended versions of long-term satellite ozone datasets (e.g. SAGE II, SBUV) with their previous counterparts (e.g. Soukharev and Hood (2006); Austin et al. (2008)).
The SAGE II dataset has been widely used for ozone studies because of its long-term stability.
SAGE II ozone data are available as number densities on altitude levels and post-processed to mixing ratios on pressure levels. The SAGE II version 6.2 (v6.2) mixing ratio dataset shows a positive annual mean SOR with a peak magnitude of ∼5% near the tropical stratopause. However, the more recent SAGE II v7.0 dataset shows a substantially smaller SOR at the tropical stratopause of ∼1%.
Conversely, the SORs in the equivalent SAGE II number density datasets are much more consistent for v6.2 and v7.0, and show a three-peaked structure in the tropics/subtropics with a magnitude of up to 3-4%.
By applying a post-hoc method to convert SAGE II number densities to mixing ratios, we have shown that the differences in SOR mostly arise from the change in independent temperature record used by the SAGE II team to convert number densities to mixing ratios: v6.2 uses NMC/NCEP and v7.0 uses MERRA-1 temperatures. Differences between these temperature records in both long-term trends and solar cycle variations contribute to the changes in SOR described above. Since both temperature records contain known issues (e.g. Damadeo et al. (2013); Mitchell et al. (2015a)), we conclude that the latest SAGE II v7.0 ozone number densities are likely to be most reliable for estimating the SOR at the present time. This is an important conclusion because several of the existing ozone datasets developed for use in global climate models have been based on SAGE II v6.2 mixing ratio data, including the dataset developed for CMIP5 simulations (Cionni et al., 2011).
We further analysed the annual mean SOR in three extended SAGE II datasets that have merged more recent GOMOS (2002-11) or OSIRIS (2002-13) data with SAGE II v7.0 number densities. Two SAGE-GOMOS datasets were analysed that adopt different methods for merging the satellite records (Kyrölä et al., 2015; Penckwitt et al., 2015). These records show broadly similar SORs, but the dataset that uses SAGE II as a reference and adjusts GOMOS using seasonally-varying offsets at each latitude and altitude (Penckwitt et al., 2015) was found to have a SOR with a noisier spatial structure.
The SAGE-OSIRIS dataset (Bourassa et al., 2014) shows a significant positive SOR of ∼2% between 30-40 km, but a weaker and less significant SOR in the tropical upper stratosphere than is found in the SAGE-GOMOS datasets. Thus the inclusion of OSIRIS data results in a markedly different SOR to that found in the SAGE II v7.0 number densities that underpin the first part of the record. Given these various issues, we conclude that the SAGE-GOMOS 1 dataset (Kyrölä et al., 2015) is likely to be the most reliable extended SAGE II dataset for estimating the SOR at the present time.
Analysis of the recently released SBUVMOD VN8.6 data produced by NASA shows a smaller SOR in the tropical upper stratosphere by ∼1% compared to the previous SBUVMOD VN8.0 data (Soukharev and Hood, 2006). However, the SBUV Merged Cohesive VN8.6 dataset from NOAA, which takes a different approach for combining the individual SBUV VN8.6 records, shows a SOR that more closely matches SBUVMOD VN8.0. Nevertheless, the differences in the magnitude of the SOR between the various SBUV records are generally smaller than those between the SAGE II v6.2 and v7.0 mixing ratio datasets and are not highly statistically significant given the estimated uncertainties in the SOR from the regression model. We therefore suggest that the SBUVMOD VN8.6 dataset is most appropriate for analysing the SOR since it is the longest of the currently available SBUV records.
Analysis of the SOR on monthly timescales in the SBUVMOD VN8.6 dataset reveals larger horizontal and vertical gradients in the SOR, particularly in the winter extratropics. Hood et al. (2015) analysed CMIP5 models with interactive chemistry and concluded that the models with seasonal variations in the SOR that best matched observations simulated changes in high latitude zonal winds that more closely resemble reanalysis data. Seasonal variations in the SOR may therefore be important for the climate response to solar variability, but the quantitative importance of this feedback for stratospheric dynamics remains to be tested.
To allow for a realistic representation of the climate impacts of solar variability in models, simulations should include the effects of both the SOR and variations in spectral solar irradiance (Matthes et al., 2016). Our results raise issues for how to best represent the SOR in 'non-interactive' climate models for which the SOR must be externally prescribed. For example, ozone databases for climate models are usually created using a variety of ozone measurements, and therefore implicitly include a representation of the SOR that emerges from whichever combinations of data are included (e.g. Cionni et al. (2011); Bodeker et al. (2013)). However, the differences in the magnitude and structure of the 'best estimate' SOR between the various satellite datasets presented here would likely result in different climate responses to solar forcing. There is therefore a need for new studies to explore the effects of uncertainties in the SOR for climate simulation, particularly in light of the substantial, but largely unexplained, spread in climate responses to the 11-year solar cycle across CMIP5 models (Mitchell et al., 2015b; Hood et al., 2015).

Table 1. Overview of the satellite ozone datasets used in this study.

Figure 3 shows timeseries of monthly and tropical (30°S-30°N) mean percent ozone anomalies from 1984 to 2004 at select stratospheric levels for SAGE II v6.2 and v7.0 in units of mixing ratios (on pressure surfaces) and number densities (on approximately equivalent altitude surfaces). Data are only plotted where at least 1/2 of the points within the tropical band have values in a given month. The lowest panel shows the monthly F10.7 cm solar flux for reference. The anomalies in the two ozone number density datasets (blue and green lines) are in close agreement in the mid-stratosphere (24, 31 and 36 km) both in terms of high frequency fluctuations and long-term changes. At 31 km, there are ozone variations that are consistent with a QBO influence. At 36 and 40 km, there are variations that are visibly in phase with the solar cycle, with relatively high ozone values from 1989 to 1992 during solar cycle 22 maximum, and lower ozone values from 1994 to 1998 during the cycle minimum. The data show greater variance in the early and later parts of the records and fluctuations in phase with the solar cycle are not evident from the timeseries alone.
Figures 4(a) and 4(b) show latitude-altitude plots of the SOR for SAGE II v6.2 and v7.0 number densities, respectively. The 95% confidence intervals for the SORs in Figure 4 expressed as percent

Figures 4(c) and 4(d) show equivalent plots to 4(a) and 4(b) for SAGE II in units of mixing ratios on pressure levels. The SORs between ∼50-10 hPa are very similar in the two versions and strongly resemble Figures 4(a) and 4(b), with a positive SOR in the tropical lower stratosphere of ∼1-2%.
g. Damadeo et al. (2013) for more details). The NMC/NCEP temperature record used to convert SAGE II is therefore constructed from several component datasets. Regardless of the exact cause, it seems likely that some of the temperature variations in the NMC/NCEP record are spurious and this may impact on the diagnosed SOR in the SAGE II v6.2 mixing ratio data. The temperature variations in MERRA-1 over the period 1985-2003 are generally smaller in magnitude than those found in NMC/NCEP, with the exception of a marked cooling at 1 hPa of ∼3 K between 2001 and 2003, which is opposite to what is seen in NMC/NCEP. This cooling in MERRA-1 leads the decline in solar forcing during the downward phase of solar cycle 23 by around a year, Figure 7 shows the annual mean solar cycle signals in stratospheric temperatures derived for the NMC/NCEP and MERRA-1 datasets over the period 1985-2003. Although the sign of the temperature signals is consistent in most regions, the maximum warming in the tropics at solar maximum occurs at 4 hPa in MERRA-1 as compared to 2 hPa in NMC/NCEP. The peak magnitude of the solar cycle temperature response is also around 25% smaller in MERRA-1 compared to NMC/NCEP. The impact of these differences on the SOR in SAGE II mixing ratio data is explored in the next section.
Figures 8(d) and 8(e) show the SOR for the SAGE II v6.2 data converted to mixing ratios using a monthly temperature climatology from MERRA-1 added to a latitude-height-time dependent linear trend and solar cycle term (see Figure 7) extracted from either NMC/NCEP (Figure 8(d)) or MERRA-1 (Figure 8(e)). The difference between Figures 8(d) and 8(e) is shown in Figure 8(f) for reference. Figures 8(d-f) are very similar to Figures 8(a-c), indicating that the majority of the difference in SOR in Figure 8(c) can be interpreted as due to differences in long-term trends and solar cycle variability in temperatures between NMC/NCEP and MERRA-1. Further tests (not shown) show that the diagnosed SORs are not affected by the choice of base temperature climatology (i.e. MERRA-1 or NMC/NCEP). The remaining panels, Figures 8(g-i) and 8(j-l), show equivalent results to Figures 8(d-f), but with the conversion to mixing ratios performed with the temperature climatology added to either the linear trend (Figures 8(g-i)) or solar cycle (Figures 8(j-l)) components of temperature variability from the two datasets. In both of these further tests, the SOR in the tropical upper stratosphere is larger for the SAGE II data converted using NMC/NCEP data (Figures 8(g,j)). This indicates that both components of the temperature variability contribute to the differences in SOR in Figure 8(c).

Figure 3, as expected. In the post-2004 period, where either GOMOS or OSIRIS data are included, the datasets show generally consistent behaviour in the mid-stratosphere during the overlap period up to 2011. QBO-like variations in ozone are visible in the timeseries at 24 and 31 km. At 36 km, there is a decline in ozone from 2004-09 in all three datasets, with increases subsequent to this. However, in the upper stratosphere (48 km) there are more substantial differences between the datasets, particularly between the SAGE-GOMOS and SAGE-OSIRIS records. SAGE-OSIRIS shows mean positive anomalies from 2004-13, particularly in the latter part of the record, whereas the two SAGE-GOMOS datasets show negative anomalies between 2007-10, which coincide with the minimum of solar cycle 23. These differences in ozone variability during the post-SAGE II period may affect the SORs diagnosed in the extended datasets, as compared to that found for the SAGE II v7.0 data alone (Figure 4(b)).
Figures 10(a-c) show the SORs in the three extended SAGE II datasets and Figures 10(d-f) show their associated 95% confidence intervals in terms of percent ozone. An indication of the importance of how the satellite records are merged for the SOR can be seen by comparing Figures 10(a) and 10(b), which show the SOR in SAGE-GOMOS 1 and SAGE-GOMOS 2, respectively. The SOR in SAGE-GOMOS 1 shows a generally smoother spatial structure as compared to SAGE-GOMOS 2, although the magnitudes are not distinguishable from one another given the estimated confidence intervals (Figures 10(d-e)). Nevertheless, since statistical uncertainties in the SOR are not typically accounted for in solar-climate studies (e.g. Gray et al. (2009)) or in climate model ozone datasets (e.g. Cionni et al. (2011)), differences in the 'best estimate' of the SOR between the datasets remain important to characterise. The uncertainties in the SOR in SAGE-GOMOS 2 (Figure 10(e)) are similar to those found in the SAGE II v7.0 number density dataset (Figure 5(b)), whereas the magnitude of the uncertainties in the SOR in SAGE-GOMOS 1 (Figure 10(d)) is reduced compared to SAGE II v7.0, particularly in the upper stratosphere.
on the Earth's climate (TOSCA) for a Short-term Scientific Mission to GEOMAR in September 2014 which initiated this work. Parts of the work at GEOMAR Helmholtz Centre for Ocean Research Kiel were performed within the Helmholtz-University Young Investigators Group NATHAN, funded by the Helmholtz-Association and GEOMAR. We thank Stacey Frith for providing useful information about the SBUV records. We also thank the many instrument scientists and groups who have contributed to the development of the merged SAGE-GOMOS 1, SAGE-GOMOS 2, and SAGE II OSIRIS datasets, and for having made their data available for this study.

Figure 1. Timeseries of the six basis functions used in the MLR analysis. (a) Solar forcing based on F10.7 cm solar radio flux; (b) a trend term based on the monthly CO2 concentration at Mauna Loa; (c) Equivalent effective stratospheric chlorine; (d) the Niño 3.4 index for ENSO; (e, f) two QBO indices based on tropical zonal winds at 50 and 30 hPa. The timeseries are in units of standard deviation and the time period is 1970-2015. A volcanic term is not included because the 2 year periods following the two major tropical eruptions in this epoch (El Chichón and Mt Pinatubo) are excluded from the regression analysis.

Figure 2. Timeseries of the fraction of sunrise to total (sunrise + sunset) profiles used to generate monthly mean ozone values in the tropics (30°S-30°N) at 1 hPa for the SAGE II v7.0 vmr dataset.

Figure 4. The percent (%) annual solar-ozone response (SOR) (per 130 SFU) for the (a, d) SAGE II v6.2 data and (b, e) SAGE II v7.0 data in terms of (a, b) number density-altitude units and (d, e) volume mixing ratio-pressure units. Panel (c) shows (b) minus (a), and panel (f) shows (e) minus (d). The contour interval is 1%. The hatching denotes regions where the SOR is not statistically distinguishable from zero at the 95% confidence level.

Figure 5. The 95% confidence intervals (CI95%) on the SORs (SOR ± CI95%) shown in Figure 4 for the (a, c) SAGE II v6.2 data and (b, d) SAGE II v7.0 data in terms of (a, b) number density-altitude units and (c, d) volume mixing ratio-pressure units. The contour interval is 0.5%. The hatching is as in Figure 4.
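The relationship between the confidence intervals shown here and the hatching in Figure 4 can be sketched as follows. This assumes a simple normal approximation (a factor of 1.96 times the standard error of the regression coefficient); how the standard errors are actually estimated in this study, for example any treatment of autocorrelated residuals, is not represented. The function names and numbers are hypothetical.

```python
def ci95_half_width(std_error, z=1.96):
    """Half-width of a 95% confidence interval under a normal approximation."""
    return z * std_error

def is_hatched(sor, std_error):
    """True where the interval SOR ± CI95% includes zero,
    i.e. the SOR is not distinguishable from zero at the 95% level."""
    return abs(sor) <= ci95_half_width(std_error)

# Illustrative values in percent ozone per 130 SFU (made up for this sketch).
weak = is_hatched(sor=1.0, std_error=0.6)     # interval spans zero
strong = is_hatched(sor=3.0, std_error=0.5)   # interval excludes zero
```

Under these assumptions, a grid point is hatched whenever the magnitude of the SOR is no larger than the half-width plotted in this figure.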

Figure 6. Timeseries of tropical mean temperature anomalies (K) from the NMC/NCEP (dashed) and MERRA-1 (solid) datasets for (top-to-bottom) 1, 2, 5, 10 and 30 hPa. The time period is 1979-2013. The thick red lines denote the periods excluded from the MLR analysis following major volcanic eruptions. The bottom panel shows the F10.7 cm solar flux for reference.

Figure 7. The 11-year solar cycle signals in temperature (K) from the (a) MERRA-1 and (b) NMC/NCEP datasets. Shading as in Figure 4. The contour interval is 0.25 K. These temperature fields are used in the 'post-hoc' conversion of SAGE II v6.2 number densities to mixing ratios (see Section 4.1.2 for details).

Figure 8. The percent (%) annual solar-ozone response (SOR) (per 130 SFU) in SAGE II v6.2 data converted from number densities to mixing ratios for the period 1985-2003 using the method described in Section 4.1.2. The conversion is first conducted using full timeseries of monthly (a) NMC/NCEP and (b) MERRA-1 temperatures. Panel (c) shows (b) minus (a). A comparison of panels (a-c) with Figures 4(a-c) demonstrates the performance of the 'post-hoc' conversion. (d-f) As in (a-c) but with the number density to mixing ratio conversion performed using a monthly temperature climatology from MERRA-1 added to a linear trend and solar signal in stratospheric temperatures extracted from (d) NMC/NCEP and (e) MERRA-1. The remaining rows show the same as (d-f) but with the conversion performed with the (g-i) linear trend or (j-l) solar cycle temperature terms alone. The shading is as in Figure 4. The contour interval is 1% in the left and middle columns and 0.5% in the right-hand column.
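The dependence of the converted mixing ratios on the temperature dataset can be illustrated with the ideal gas law. The sketch below is not the Section 4.1.2 procedure itself: it assumes the ozone number density is already available on a pressure level, and the function name, units and input values are illustrative assumptions.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def number_density_to_vmr(n_o3_cm3, temperature_k, pressure_hpa):
    """Convert an ozone number density to a volume mixing ratio.

    Uses the ideal gas law: the air number density is p / (k_B * T),
    so vmr = n_O3 * k_B * T / p.
    n_o3_cm3      : ozone number density in molecules per cm^3
    temperature_k : temperature in K
    pressure_hpa  : pressure in hPa
    """
    n_o3_m3 = n_o3_cm3 * 1.0e6          # molecules cm^-3 -> molecules m^-3
    pressure_pa = pressure_hpa * 100.0  # hPa -> Pa
    return n_o3_m3 * K_B * temperature_k / pressure_pa

# Illustrative upper-stratospheric values (assumed, not taken from the datasets).
vmr_ref = number_density_to_vmr(8.0e10, 260.0, 1.0)
vmr_warm = number_density_to_vmr(8.0e10, 261.0, 1.0)

# Fractional change in mixing ratio for a 1 K temperature difference.
sensitivity = vmr_warm / vmr_ref - 1.0
```

Because the mixing ratio scales linearly with temperature at fixed number density and pressure, a 1 K difference between temperature records near 260 K changes the converted mixing ratio by roughly 0.4%, so a trend or solar signal in the temperature field carries directly into the converted ozone.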

Figure 10. (a-c) As in Figure 4, but for the extended SAGE II number density datasets: (a) SAGE-GOMOS 1, (b) SAGE-GOMOS 2, (c) SAGE-OSIRIS. SORs are derived for different periods as stated in the headers. The contour interval is 1%. (d-f) As in Figure 5, but for the datasets as shown in (a-c). The contour interval is 0.5%.

Figure 13. The percent (%) monthly solar-ozone response (SOR) (per 130 SFU) in the SBUVMOD VN8.6 dataset for the period 1970-2012. The contour interval is 1%. The grey shading denotes regions where the SOR is not statistically distinguishable from zero at the 95% confidence level.