Comparison of VLT/X-shooter OH and O2 rotational temperatures with consideration of TIMED/SABER emission and temperature profiles

Rotational temperatures Trot derived from lines of the same OH band are an important method to study the mesopause region near 87 km. To measure realistic temperatures, the rotational level populations have to be in local thermodynamic equilibrium (LTE). However, this might not be fulfilled, especially at high emission altitudes. In order to quantify possible non-LTE contributions to the OH Trot as a function of the upper vibrational level v', we studied a sample of 343 echelle spectra taken with the X-shooter spectrograph at the Very Large Telescope at Cerro Paranal in Chile. These data allowed us to analyse 25 OH bands in each spectrum. Moreover, we could measure lines of O2b(0-1), which peaks at 94 to 95 km, and O2a(0-0) with an emission peak at about 90 km. Since the radiative lifetimes are relatively long, the derived O2 Trot are not significantly affected by non-LTE contributions. For a comparison with OH, the differences in the emission profiles were corrected by using OH emission, O2a(0-0) emission, and CO2-based temperature profile data from the multi-channel radiometer SABER on the TIMED satellite. For a reference profile at 90 km, we found a good agreement of the O2 with the SABER-related temperatures, whereas the OH temperatures, especially for the high and even v', showed significant excesses with a maximum of more than 10 K for v' = 8. We could also find a nocturnal trend towards higher non-LTE effects, particularly for high v'.


Introduction
Studies of long-term trends as well as short-term variations of the temperature in the mesopause region are often based on so-called rotational temperatures Trot derived from emission lines of the hydroxyl (OH) airglow (e.g. Beig et al., 2003, 2008; Khomich et al., 2008; Reisin et al., 2014), which is mostly caused by the chemical reaction of hydrogen (H) and ozone (O3) (Bates and Nicolet, 1950) at a typical height of 87 km (e.g. Baker and Stair, 1988). Ground-based instruments usually measure the intensities of a few lines with different rotational upper levels N′ of a single band (e.g. Beig et al., 2003; Schmidt et al., 2013). Assuming a Boltzmann distribution of the populations at the different rotational levels, a temperature can be derived, which is reliable if the time between thermalising collisions is short compared to the natural radiative lifetime of a state. The latter ranges from about 58 to 5 ms for the relevant upper vibrational levels v′ = 1 to 9 of the OH electronic ground state (Xu et al., 2012). For the OH peak emission height, this appears to be sufficiently long to establish the assumed Boltzmann distribution (Khomich et al., 2008).
However, at higher altitudes with lower air density, collisions may be too infrequent, especially for the higher v′. Hence, the measured Trot can deviate from the true temperatures due to the contribution of a population of OH molecules for which the condition of local thermodynamic equilibrium (LTE) for the rotational levels is not fulfilled. Comparisons of Trot from various bands differing in v′ show clear signatures of such non-LTE effects (Cosby and Slanger, 2007; Noll et al., 2015). The measured Trot tend to increase with v′, and even v′ appear to indicate higher Trot than the adjacent odd v′. For this reason, the highest Trot are found for OH bands with v′ = 8. As discussed by Noll et al. (2015), this can only be explained by a significant impact of a nearly nascent vibrational and rotational level distribution related to the hydrogen-ozone reaction, which mostly populates v′ = 9 and 8 (Charters et al., 1971; Ohoyama et al., 1985; Adler-Golden, 1997). The differences in the results for even and odd v′ can then be understood by the higher initial Trot for v′ = 8 (Llewellyn and Long, 1978) and the high probability of radiative transitions with ∆v = 2 (Mies, 1974; Rousselot et al., 2000; Khomich et al., 2008). The Trot non-LTE contributions also appear to vary with time, as observed by Noll et al. (2015). At Cerro Paranal (24.6° S), they tend to increase during the night when data of several years are averaged. This is probably caused by a rising OH emission layer, which can be inferred from a decrease of the measured OH intensity (Yee et al., 1997; Melo et al., 1999; Liu and Shepherd, 2006; von Savigny, 2015). Apart from v′, the measured Trot are also strongly affected by the choice of lines. Higher N′ tend to show stronger non-LTE contributions (Pendleton et al., 1989, 1993; Perminov and Semenov, 1992; Cosby and Slanger, 2007).
For this reason, only lines of the first three or four rotational levels are usually used for Trot derivations (e.g. Beig et al., 2003; Perminov et al., 2007; Cosby and Slanger, 2007; Schmidt et al., 2013). However, even this restriction is not sufficient to obtain line-independent Trot. The OH electronic ground level has two substates, X²Π3/2 and X²Π1/2. Population differences of several per cent were found by Noll et al. (2015) for rotational levels of the same substate and for comparisons between the two substates.
The approach to estimate temperatures in the mesopause region by means of airglow lines with different rotational upper levels can also be applied to bands of molecular oxygen (O2). However, long ground-based time series are only available for O2(b¹Σg⁺ − X³Σg⁻)(0-1) at about 0.865 µm (e.g. Khomich et al., 2008; Reisin et al., 2014), which is sufficiently bright, hardly blended by OH lines, and not affected by self-absorption in the lower atmosphere like the stronger (0-0) band. Since O2b(0-1) remains unresolved with the standard instruments in use, a simulation of the temperature dependence of the band shape is required (e.g. Scheer and Reisin, 2001; Shiokawa et al., 2007; Khomich et al., 2008). In contrast to OH, the radiative lifetime τ of O2b(0-1) is relatively long. Using the Einstein-A coefficients from the HITRAN2012 molecular line database (Rothman et al., 2013), τ is about 210 s. This suggests that O2b(0-1) Trot are not significantly affected by non-LTE contributions. The b¹Σg⁺ state is essentially populated by atomic oxygen recombination and subsequent collisions (Greer et al., 1981), which results in an emission profile with a peak at about 94 to 95 km (derived from the (0-0) band; Watanabe et al., 1981; Greer et al., 1986; Yee et al., 1997), i.e. O2b(0-1) is sensitive to a different part of the mesopause region compared to OH. With full widths at half maximum (FWHM) of 8 to 9 km (Baker and Stair, 1988; Yee et al., 1997) for both kinds of emission, there is only a minor overlap of their profiles.
For a comparison with OH, transitions from the a¹∆g state of O2 to the ground state X³Σg⁻ are more promising since the measured nighttime emission peak heights of the (0-0) band at 1.27 µm are close to those of OH (McDade et al., 1987; López-González et al., 1989; Lipatov, 2013). During most of the night, they are at about 90 km. At the beginning of the night, the effective emission heights can be several km lower. This behaviour is caused by the contribution of a daytime-related a¹∆g excitation by ozone photolysis to the nighttime-related atomic oxygen recombination (Vallance Jones and Gattinger, 1963; McDade et al., 1987; López-González et al., 1989). The former is relevant for the night since the spontaneous radiative lifetime of a¹∆g is about 75 min due to an Einstein-A coefficient of about 2.2 × 10⁻⁴ s⁻¹ for the (0-0) band (Lafferty et al., 1998; Newman et al., 1999; Gamache and Goldman, 2001). The daytime emission peaks at about 50 km (Evans et al., 1968; Howell et al., 1990) and effectively rises in altitude after sunset up to the nighttime peak due to the faster decay of the a¹∆g population by collisional quenching at lower altitudes. Although non-LTE effects are negligible, only Mulligan and Galligan (1995) studied Trot based on the bright O2a(0-0) band. Their sample consists of ground-based observations with a Fourier transform spectrometer at Maynooth, Ireland (53.2° N), and covers 91 twilight periods over 18 months. The Trot measurements required simulations of the band for different ambient temperatures due to the strong line blending at their resolution. Moreover, O2a(0-0) is strongly affected by self-absorption, which required careful radiative transfer calculations. These efforts might explain the lack of O2a(0-0) Trot in the literature. As an alternative, O2a(0-1) can be used.
Although this band is fairly weak and partly blended with strong OH lines, it was studied by Perminov and Lipatov (2013) using spectroscopic data taken at Zvenigorod, Russia (55.7° N), from 2010 to 2011. The band emission had to be modelled to retrieve Trot.
In this paper, we will compare Trot derived from 25 OH bands, O2b(0-1), and O2a(0-0) to quantify the non-LTE effects related to the OH measurements. For this purpose, we will use 343 observations from the X-shooter echelle spectrograph (Vernet et al., 2011) at the Very Large Telescope (VLT) of the European Southern Observatory (ESO) at Cerro Paranal in Chile (2635 m, 24°38′ S, 70°24′ W). The OH data were already discussed by Noll et al. (2015). As X-shooter covers the wavelength range from 0.3 to 2.5 µm with a resolving power of at least λ/∆λ ≈ 3300, the different OH and O2 bands can be studied in parallel and without the need of band simulations for the Trot retrieval. The resolution is sufficient to measure individual lines even in O2a(0-0).
We will not directly compare the derived Trot since the studied bands probe different parts of the mesopause temperature profile due to the differences in the altitude distributions of the emission. In particular, the significant changes in the emission profile of O2a(0-0) (López-González et al., 1992; Gao et al., 2011; Lipatov, 2013) have to be corrected. For this reason, we will use volume emission rate (VER) and temperature profiles from satellite observations to derive temperature corrections depending on the VER differences of the bands to be compared. For this purpose, data of the ten-channel broad-band radiometer SABER (Russell et al., 1999) on the TIMED satellite launched in 2001 are best suited. The limb sounder measures profiles of the crucial O2a(0-0) band and two OH-related channels at 1.64 and 2.06 µm, which probe different upper vibrational levels v′ (4/5 and 8/9; see Baker et al., 2007). The latter is important since the OH emission peak heights depend on v′ (Adler-Golden, 1997; Xu et al., 2012; von Savigny et al., 2012). The temperature profiles are derived from the SABER CO2 channels under the consideration of non-LTE effects (Mertens et al., 2001, 2004; Remsberg et al., 2008; Rezac et al., 2015). TIMED has a slowly-precessing polar orbit, which allows a 24 h coverage at low and mid-latitudes within about 60 days (Russell et al., 1999). Since this results in only two narrow local time intervals for a given site and day of year, our study will focus on average X-shooter and SABER data for certain periods.
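To illustrate why emission layers at different heights see different temperatures, the following minimal sketch computes a VER-weighted mean of a temperature profile. It assumes that such a weighting is a reasonable proxy for the temperature probed by a band; the function names, the Gaussian layer shapes, and the linear temperature profile are hypothetical and are not taken from the SABER data or from the correction procedure of this paper.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y over x (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def ver_weighted_temperature(alt_km, ver, t_kin):
    """VER-weighted mean temperature of an emission layer (illustrative assumption)."""
    alt_km = np.asarray(alt_km, dtype=float)
    ver = np.asarray(ver, dtype=float)
    t_kin = np.asarray(t_kin, dtype=float)
    return trapezoid(ver * t_kin, alt_km) / trapezoid(ver, alt_km)

def gaussian_layer(alt_km, peak_km, fwhm_km):
    """Hypothetical Gaussian emission profile with the given peak height and FWHM."""
    sigma = fwhm_km / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((alt_km - peak_km) / sigma) ** 2)

# Hypothetical inputs: 0.2 km grid and a linear temperature profile (2 K per km).
alt = np.linspace(70.0, 110.0, 201)
t_profile = 190.0 + 2.0 * (alt - 90.0)

t_oh = ver_weighted_temperature(alt, gaussian_layer(alt, 87.0, 8.5), t_profile)
t_o2a = ver_weighted_temperature(alt, gaussian_layer(alt, 90.0, 8.5), t_profile)
delta_t = t_o2a - t_oh  # temperature offset caused purely by the layer heights
```

In this toy setup, the offset between the two layers simply reflects the assumed temperature gradient times the difference in peak height; real profiles are neither Gaussian nor embedded in a linear temperature gradient.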
The paper is structured as follows. We will start with the description of the X-shooter and SABER data used for this study (Sect. 2). Then, we will describe line measurements and Trot retrieval from the X-shooter data, the derivation of SABER-based profiles suitable for the measured airglow bands, and the correction of the corresponding Trot for the differences in the VER profiles (Sect. 3). In Sect. 4, we will discuss the resulting estimates of the non-LTE effects related to the OH Trot and their change with nighttime and season. Finally, we will draw our conclusions (Sect. 5).

X-shooter
The X-shooter instrument (Vernet et al., 2011) of the VLT is an echelle slit spectrograph that covers the very wide wavelength range from 0.3 to 2.5 µm by three arms called UVB (0.30 to 0.59 µm), VIS (0.53 to 1.02 µm), and NIR (0.99 to 2.48 µm). This allows one to study various airglow bands simultaneously, as demonstrated by Noll et al. (2015), who investigated 25 OH(v′-v′′) bands between 0.58 and 2.24 µm with v′ from 2 to 9. Since X-shooter is an instrument for observing astronomical targets, airglow studies have to rely on archival, unprocessed spectra, which differ in terms of line of sight, exposure time, and slit-dependent spectral resolution. For this study, we have used the same data set that was selected and processed by Noll et al. (2015). The extended analysis of these data, therefore, focuses on the O2b(0-1) and O2a(0-0) bands (see Sect. 3.1). Noll et al. (2015) reduced archival data from October 2009 (start of the archive) to March 2013 using version V2.0.0 of the ESO public pipeline (Modigliani et al., 2010). The resulting wavelength-calibrated two-dimensional (2-D) spectra were then collapsed into a 1-D spectrum by applying a median along the spatial direction. For the flux calibration, response curves were created based on data of spectrophotometric standard stars (Moehler et al., 2014) corrected for molecular absorption and atmospheric extinction. Due to the instability of the flat-field calibration lamps for the correction of the pixel-to-pixel sensitivity variations in the 2-D echelle spectra, the fluxes at long wavelengths beyond 1.9 µm including the OH bands (8-6) and (9-7) are relatively uncertain. For 77 % of the data, this could be improved by means of spectra of standard stars with known spectral shapes, i.e. Rayleigh-Jeans slopes.
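The collapse of a 2-D spectrum by a median along the spatial direction can be sketched as follows. The array layout (spatial rows by wavelength pixels), the function name, and the numbers are hypothetical; this is not the interface of the ESO pipeline.

```python
import numpy as np

def collapse_to_1d(spec_2d, spatial_axis=0):
    """Median along the spatial direction; NaN pixels (e.g. masked data) are ignored."""
    return np.nanmedian(np.asarray(spec_2d, dtype=float), axis=spatial_axis)

# Hypothetical frame: 5 spatial rows x 4 wavelength pixels of pure sky emission,
# with one row contaminated by the continuum of the astronomical target.
sky = np.array([1.0, 5.0, 1.2, 0.8])
spec_2d = np.tile(sky, (5, 1))
spec_2d[2] += 100.0

spec_1d = collapse_to_1d(spec_2d)
```

A median is a natural choice here because it suppresses the contribution of a target that covers only a minority of the spatial pixels, whereas a mean would be biased by it.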
From the processed data, a sample of 343 VIS- and NIR-arm spectra was selected. The selection criteria were a minimum exposure time of 3 min, a maximum slit width of 1.5′′ corresponding to λ/∆λ ≈ 5400 for the VIS arm and 3300 for the NIR arm, and a difference in start, end, and length of the exposure between VIS and NIR arm of less than 5 min. The latter was necessary since the observations in the different X-shooter arms can be performed independently. Moreover, 21 % of the combined VIS- and NIR-arm spectra were rejected due to unreliable Trot in at least one of the 25 investigated bands. This can be caused by low data quality, an erroneous data reduction, or critical contaminations by residuals of the observed astronomical object. A total of 36 spectra of the final sample could only be used up to 2.1 µm, which excludes OH(9-7), because a so-called K-blocking filter was applied (Vernet et al., 2011). The 343 selected spectra show a good coverage of nighttimes and seasons and, therefore, appear to be representative of Cerro Paranal. For more details, see Noll et al. (2015).

SABER
Our analysis of VER and temperature profiles is based on version 2.0 of the limb sounding products of the SABER multi-channel photometer on TIMED (Russell et al., 1999). We used VER profiles from the 1.27 µm channel of O2a(0-0) and the two OH-related channels centred on 1.64 and 2.06 µm. We took the "unfiltered" data products, which are corrected VERs that also consider the emission of the targeted molecular band(s) outside the filter (Mlynczak et al., 2005). For O2a(0-0), the correction factor is 1.28. Finally, we used the SABER kinetic temperature Tkin profiles. In version 2.0, they are based on the limb radiance measurements in the 4.3 and 15 µm broad-band channels and a complex retrieval algorithm involving non-LTE radiative transfer calculations (Rezac et al., 2015). For our analysis, we could use the VER and Tkin profiles between 40 and 110 km because only for this altitude range all data are available. The lower and upper limits are related to the VERs and Tkin, respectively. We mapped all profiles (having a vertical sampling of about 0.4 km) to a regular grid with a step size of 0.2 km.
Figure 2. SABER kinetic temperature Tkin in K as a function of the longitude distance to Cerro Paranal λ − λCP in degrees for the altitudes 85 (long dashed), 90 (solid), and 95 km (short dashed) and bin sizes of 4° and 1 km. The averages were derived from SABER data for the latitudes between 23° and 26° S, the years from 2009 to 2013, and zenith distances of at least 100°. The final SABER sample selection criterion |λ − λCP| ≤ 10° is visualised by vertical bars.
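The mapping of a profile to the regular 0.2 km grid between 40 and 110 km can be sketched as below. Linear interpolation is an assumption, since the interpolation scheme is not stated in the text; the function name and the test profile are hypothetical.

```python
import numpy as np

def regrid_profile(alt_km, values, z_min=40.0, z_max=110.0, step=0.2):
    """Interpolate a profile linearly onto a regular altitude grid.

    np.interp needs increasing abscissae, so the input is sorted first.
    Outside the input range np.interp holds the edge values, so the input
    profile should cover the full target range.
    """
    alt_km = np.asarray(alt_km, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(alt_km)
    n_points = int(round((z_max - z_min) / step)) + 1
    grid = np.linspace(z_min, z_max, n_points)
    return grid, np.interp(grid, alt_km[order], values[order])

# Hypothetical profile sampled every 0.4 km from top to bottom.
alt_native = np.linspace(112.0, 38.0, 186)
t_native = 150.0 + 0.5 * alt_native

grid, t_grid = regrid_profile(alt_native, t_native)
```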
To cover a similar period as the X-shooter sample (see Sect. 2.1), we only considered SABER data of the years 2009 to 2013. The X-shooter data set is a nighttime sample with a minimum solar zenith distance z⊙ of 104°. For this reason, we only selected SABER data with a minimum z⊙ of 100°. With this limit, it is guaranteed that the O2a(0-0) main emission peak is located in the mesopause region, i.e. it is typical of nighttime conditions. The fast change of the O2a(0-0) VER profile at twilight is illustrated in Fig. 1. As discussed in Sect. 1, the daytime population related to ozone photolysis with a maximum around the stratopause decays after sunset. The late nighttime profile remains relatively constant until the middle atmosphere is illuminated by the sun at dawn.
Since airglow emissions and temperatures significantly depend on geographic location (e.g. Yee et al., 1997; Melo et al., 1999; Hays et al., 2003; Marsh et al., 2006; Xu et al., 2010) due to differences in the solar irradiation and activity of tides, planetary and gravity waves, we selected only SABER data within the range from 23° to 26° S. Cerro Paranal is located at 24.6° S. With this criterion and the exclusion of 14 profiles which were suspicious in terms of the VER peak positions for the two OH channels, the resulting sample includes 27 796 measurements. We further restricted the data set by also considering longitudinal variations. Figure 2 shows the SABER Tkin for the altitude levels 85, 90, and 95 km as a function of longitudinal distance λ − λCP to Cerro Paranal (70.4° W). On average, there are variations of several K and changes of the variability pattern depending on altitude. For this reason, we limited the sample by the criterion |λ − λCP| ≤ 10°, which resulted in 1685 profiles. A further reduction of the longitude range is not prudent because then there could be too few profiles for representative results for time and/or seasonal averages. We also checked the results presented in the subsequent sections with respect to the longitude limit (see Sect. 3.3). Indeed, the agreement of X-shooter and SABER variations does not significantly improve with more restrictive limits. Also note that the SABER tangent point can easily move by more than 100 km during a profile scan. Moreover, the TIMED yaw cycle of 60 days (Russell et al., 1999) causes large gaps in the distribution of local times and days of year. For odd months (e.g. March), the period around midnight is well covered at Cerro Paranal. Even months show a better evening/morning coverage. Therefore, limitations in the correlation of X-shooter and SABER data are unavoidable.
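The three selection criteria described above (latitude band, longitude distance to Cerro Paranal, and solar zenith distance) can be combined into a boolean mask as in the following sketch. The function and variable names are hypothetical and do not correspond to field names of the SABER products; longitudes are assumed to be given in degrees east, in either the −180 to 180 or the 0 to 360 convention.

```python
import numpy as np

LON_CP = -70.4  # Cerro Paranal longitude in degrees east (70.4 deg W)

def select_profiles(lat, lon, sza, lat_range=(-26.0, -23.0),
                    max_dlon=10.0, min_sza=100.0):
    """Boolean mask for profiles near Cerro Paranal under nighttime conditions."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    sza = np.asarray(sza, dtype=float)
    # Wrap the longitude difference into [-180, 180) degrees.
    dlon = (lon - LON_CP + 180.0) % 360.0 - 180.0
    return ((lat >= lat_range[0]) & (lat <= lat_range[1])
            & (np.abs(dlon) <= max_dlon) & (sza >= min_sza))

# Hypothetical tangent-point coordinates and solar zenith distances.
lat = [-24.6, -24.0, -30.0, -25.0]
lon = [289.6, -85.0, -70.0, -65.0]
sza = [120.0, 130.0, 140.0, 95.0]

mask = select_profiles(lat, lon, sza)
```

Only the first profile passes: the second is too far away in longitude, the third is outside the latitude band, and the fourth does not fulfil the zenith distance limit.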

Analysis
This study mainly focuses on the quantification of non-LTE contributions to OH Trot. The following section describes the different steps of this analysis, which relies on X-shooter-related OH and O2 Trot measurements and the SABER-related correction of temperature differences caused by different band-specific emission profiles.

Line intensities and rotational temperatures
The wavelength range, spectral resolution, and quality of the X-shooter spectra (Sect. 2.1) allowed us to derive Trot from line intensities of 25 OH bands, O2b(0-1), and O2a(0-0).

OH
The Trot related to OH are based on the line selection and intensity measurements of Noll et al. (2015). The selection only includes lines originating from rotational levels N′ ≤ 4 of both electronic substates X²Π3/2 and X²Π1/2 to minimise the non-LTE effects that increase with N′ (Cosby and Slanger, 2007; Khomich et al., 2008). Moreover, it focuses on P-branch lines (related to an increase of the total angular momentum by 1) since these lines are well separated in the X-shooter spectra. The OH Λ doublets (e.g. Rousselot et al., 2000) are not resolved. For reasons such as blending by airglow lines of other bands or strong absorption in the lower atmosphere (e.g. by water vapour), the number of suitable P-branch lines varied between 2 and 8 for the 25 considered OH bands. In the case of OH(6-4), an additional R-branch line was measured in order to have a minimum of three lines for quality checks. The measured intensities of all the lines were corrected for molecular absorption in the lower atmosphere by means of a transmission curve for Cerro Paranal computed with the radiative transfer code LBLRTM (Clough et al., 2005) by Noll et al. (2012). The absorption by water vapour was adapted based on data for the precipitable water vapour (PWV) retrieved from standard star spectra. For the OH line shape, Doppler-broadened Gaussian profiles related to a temperature of 200 K were assumed.
The measurement of Trot is based on the assumption of a Boltzmann distribution of the N′-related populations, which are derived from I/(g′A), i.e. the measured intensity I divided by the product of the N′-dependent statistical weight g′ (number of sublevels) and the Einstein-A coefficient for the line. In this case, the natural logarithm of this population of a hyperfine structure level of N′, which we call y, is a linear function of the level energy E′. The slope of this line is then proportional to the reciprocal of Trot (Meinel, 1950; Mies, 1974). Noll et al. (2015) used the molecular parameters E′, A, and g′ from the HITRAN2012 database (Rothman et al., 2013), which are based on  for the investigated lines (Rothman et al., 2009), and from van der Loo and Groenenboom (2007, 2008). We will mostly focus on the HITRAN-based parameters since they appear to provide more reliable Trot. Noll et al. (2015) corrected the resulting Trot for the non-LTE effects caused by the different band-dependent line sets (see Sect. 1) by introducing a reference line set consisting of the three P1- (X²Π3/2) and P2-branch lines (X²Π1/2) with N′ ≤ 3. The temperature changes for the transition from the actual to the reference line set were calculated based on the sample mean Trot of the 16 most reliable bands. The resulting set-specific ∆Trot were then applied to the temperatures for the individual observations. For this study, we used a different reference line set only including the three P1-branch lines. The corresponding results are easier to interpret since the mixing of P1- and P2-branch lines causes a negative non-LTE effect, i.e. Trot decreases because of the lower relative population of the X²Π1/2 state, which has higher E′ than X²Π3/2. We calculated the correction to the new reference set based on the four very reliable OH bands (4-1), (5-2), (6-2), and (7-4) (cf. Noll et al., 2015).
This choice is not critical since the slight v′ dependence of the energy differences for the rotational levels has only a minor impact on the resulting ∆Trot. The Trot only based on the P1 branch are 3.0 ± 0.3 K higher than those given by Noll et al. (2015) if the molecular parameters are taken from HITRAN. In the case of molecular data from van der Loo and Groenenboom (2008), ∆Trot is 2.6 ± 0.4 K. If we focused on the fainter P2-branch lines, the corrections would be 4.3 ± 1.0 K and 5.9 ± 2.5 K, respectively.
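The Boltzmann approach described above amounts to a straight-line fit of y = ln(I/(g′A)) against E′, with Trot following from the slope. The sketch below assumes level energies in cm⁻¹, so that the slope equals −hc/(k Trot); the function name is hypothetical, and the line parameters in the example are made-up numbers rather than HITRAN values.

```python
import numpy as np

HC_OVER_K = 1.438777  # second radiation constant hc/k in cm K

def rotational_temperature(intensity, g_upper, einstein_a, energy_cm1):
    """Trot in K from a linear fit of ln(I / (g' A)) versus E' (in cm^-1)."""
    intensity = np.asarray(intensity, dtype=float)
    g_upper = np.asarray(g_upper, dtype=float)
    einstein_a = np.asarray(einstein_a, dtype=float)
    energy_cm1 = np.asarray(energy_cm1, dtype=float)
    y = np.log(intensity / (g_upper * einstein_a))
    slope, _intercept = np.polyfit(energy_cm1, y, 1)
    return -HC_OVER_K / slope

# Synthetic three-line example with made-up parameters (not HITRAN data).
t_true = 190.0
energy = np.array([0.0, 80.0, 200.0])   # E' in cm^-1
g_up = np.array([8.0, 12.0, 16.0])      # statistical weights g'
a_coef = np.array([10.0, 15.0, 20.0])   # Einstein-A coefficients in s^-1
line_int = g_up * a_coef * np.exp(-HC_OVER_K * energy / t_true)

t_rot = rotational_temperature(line_int, g_up, a_coef, energy)
```

For populations that follow a single Boltzmann distribution, the fit recovers the input temperature. Non-LTE contributions would show up as a curvature in y versus E′, so that the fitted Trot depends on the chosen line set.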
Our study is based on mean Trot derived from bands with the same v′, which show very similar intensity and Trot variations. We took the v′-specific Trot from Noll et al. (2015) (corrected for the different reference line set) for each of the 343 sample spectra. Note that the v′ = 9 mean values of 36 spectra with reduced wavelength range do not include OH(9-7) (see Sect. 2.1). The effect on the sample mean Trot is negligible. The very faint OH(8-2) band is only considered by a fixed ∆Trot for v′ = 8 derived from the sample mean (see Noll et al., 2015). Excluding errors in the molecular parameters, the uncertainties of the Trot(v′) based on the sample mean intensities are between 1.0 K for v′ = 8 and 2.0 K for v′ = 2. These values are mainly based on the Trot variation for fixed v′ and are related to HITRAN data. For molecular data from van der Loo and Groenenboom (2008), the corresponding uncertainties range from 1.8 K for v′ = 9 to 3.6 K for v′ = 2. The Trot uncertainties for individual observations are higher than the stated values due to measurement errors depending on the signal-to-noise ratio.

O2b(0-1)
As discussed in Sect. 1, O2b(0-1) at 0.865 µm is the most popular non-OH molecular band for Trot retrievals in the mesopause region. It can also be measured in the X-shooter spectra. With a minimum resolving power of 5400 in the VIS arm, the individual lines of O2b(0-1) are not separated. However, in the P branch, blended lines originate from the same upper rotational level of b¹Σg⁺ (v = 0) (see Fig. 3). This allows Trot to be directly measured without the requirement for Trot-dependent synthetic spectra and data fitting, which is the standard approach (e.g. Scheer and Reisin, 2001; Shiokawa et al., 2007; Khomich et al., 2008). Consequently, the systematic uncertainties of the Trot measurement are expected to be significantly lower. The two blended subbranches are PQ and PP (e.g. Gamache et al., 1998), where the rotational level N increases by 1 and the total angular momentum J changes by 0 or +1 depending on the spin orientations in the lower state X³Σg⁻ (v = 1). Due to the symmetry of the O2 molecule, the upper state rotational levels N′ have to be even. The selection rules also forbid PQ(0), i.e. there is only a single P-branch line for N′ = 0.
Figure 3. Median X-shooter spectrum of O2b(0-1) for the 189 VIS-arm sample spectra taken with the standard slit width of 0.9′′. The original spectrum was corrected for solar absorption lines in the continuum (green areas). Pairs of PQ and PP lines with even upper rotational levels N′ from 0 to 18 are marked by circles. Note that PQ(0) does not exist. The filled symbols indicate lines that were used for the derivation of Trot.
Figure 4. The ordinate displays the natural logarithm of the HITRAN-based population of a hyperfine-structure level relative to 1 Rayleigh second. Resulting from PQ and PP line intensities, the mean populations of the even rotational levels with N′ from 0 to 18 for the full X-shooter sample are shown. Filled symbols mark reliable data points that were used for the derivation of Trot (see Fig. 3), which is illustrated by the solid regression line.
For the Trot determination, the intensities of the P-branch line pairs with N′ from 0 to 18 were measured (see Fig. 3). The lines with N′ higher than 18 are already relatively faint and strongly blended with OH lines. Before the line intensities could be determined, the continuum had to be measured and subtracted. Since an important part of the continuum is caused by zodiacal light (sunlight scattered by interplanetary dust particles) and/or scattered moonlight (e.g. Noll et al., 2012), the strong solar Ca⁺ absorption line at 866.2 nm is critical for the continuum window between the N′ = 6 and 8 lines (see Fig. 3). For this reason, the solar flux atlas of Wallace et al. (2011) was used to calculate a continuum correction, which required the adaptation of the solar transmission to the corresponding resolution of the X-shooter spectra and the determination of the variable fractional contribution of a solar-type spectrum to the continuum at O2b(0-1). The latter was derived from the depth of the Ca⁺ absorption in the X-shooter data compared to the solar spectrum and is important since there is also an airglow continuum related to reactions involving NO (Khomich et al., 2008; Noll et al., 2012; Semenov et al., 2014). After the removal of the solar lines, the median continuum level was measured in nine fixed windows, which were optimised for the setups with the lowest spectral resolution, i.e. a slit width of 1.5′′. The range between the N′ = 2 and 6 lines could not be used for the continuum determination due to significant contributions by OH lines. The resulting continuum fluxes were then linearly interpolated and subtracted from the X-shooter spectra. Finally, the intensities of the line pairs were integrated within limits which were adapted to the spectral resolution. Since the P branch of O2b(0-1) is not affected by molecular absorption (cf. Sect. 3.1.1), the corresponding correction could be neglected.
As the observations were carried out at different zenith distances, the intensities were corrected to be representative of the zenith by applying the equation of van Rhijn (1921) (see Noll et al., 2015).
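The zenith correction can be sketched with the usual thin-layer form of the van Rhijn formula, in which the intensity enhancement at zenith distance z is 1/sqrt(1 − (R sin z/(R + h))²). The Earth radius R and the layer height h of 90 km used below are assumptions for illustration; the text only states that the equation of van Rhijn (1921) was applied.

```python
import numpy as np

R_EARTH_KM = 6371.0  # assumed mean Earth radius

def van_rhijn_factor(zenith_deg, layer_height_km=90.0):
    """Intensity enhancement of a thin emission layer relative to the zenith."""
    z = np.radians(zenith_deg)
    ratio = R_EARTH_KM / (R_EARTH_KM + layer_height_km)
    return 1.0 / np.sqrt(1.0 - (ratio * np.sin(z)) ** 2)

def to_zenith(intensity, zenith_deg, layer_height_km=90.0):
    """Convert an intensity measured at zenith distance z to its zenith equivalent."""
    return intensity / van_rhijn_factor(zenith_deg, layer_height_km)

factor_zenith = van_rhijn_factor(0.0)
factor_60 = van_rhijn_factor(60.0)
```

Because of the curvature of the layer, the factor at a zenith distance of 60° stays somewhat below the plane-parallel value of 1/cos z = 2.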
The resulting sample mean populations of the rotational levels N′ = 0 to 18 are shown in Fig. 4. Similar to OH described in Sect. 3.1.1 and Noll et al. (2015), they correspond to I/(g′A), where I and A are sums for the line pairs. The molecular parameters were taken from HITRAN2012 (Rothman et al., 2013), which is essentially based on Gamache et al. (1998) for O2b(0-1). Most y values show a good correlation with E′. The outliers can be explained by contamination from OH lines. This result supports the assumption that non-LTE effects are not relevant for the rotational level population. The radiative lifetime of about 210 s derived from the HITRAN data appears to be sufficiently long.
The optimal line set for the calculation of Trot turned out to consist of PP(0) and the P-branch line pairs with N′ from 4 to 14. For the mean populations shown in Fig. 4, the corresponding systematic Trot uncertainty is only 1.8 K. This line set could be applied to 326 spectra. Due to instrumental issues or contamination by astronomical objects, 15 fits were significantly better by using only six instead of seven lines/line pairs. In two cases, only five emission features were considered. The optimal line sets were determined by an automatic iterative procedure. In order to avoid systematic differences depending on the line set, the Trot based on fewer than seven data points were corrected. For this, the reliable spectra were used to derive the differences between the Trot for the full and each reduced line set. The mean amount of the correction was 0.5 K.
We applied further corrections to the measured Trot. First, we investigated possible systematic errors related to the widths of our continuum windows, which also had to be suitable for the low-resolution data. For 23 reliable high-resolution spectra taken with the 0.7′′ slit, we could significantly increase the number of continuum pixels. The resulting temperatures were about 0.4 K lower than those obtained based on the standard windows. As a consequence, we decreased the Trot of the whole sample by the same amount. Second, we corrected the influence of the slit width on Trot by adapting the resolution of the 23 spectra taken with the 0.7′′ slit to those taken with a 0.9′′, 1.2′′, or 1.5′′ slit and deriving the Trot differences. This resulted in temperature increases of about 0.2, 0.4, and 1.0 K, respectively. All these corrections are relatively small, which makes us confident that our Trot for O2b(0-1) are robust.

O2a(0-0)
Figure 5. Median X-shooter spectrum of O2a(0-0) for the 177 NIR-arm sample spectra taken with the standard slit width of 0.9′′. SR and OP lines with odd upper rotational levels N′ from 13 to 23 are marked by circles and diamonds, respectively. The filled symbols indicate lines that were used for the derivation of Trot.
Figure 6. Resulting from SR (circles) and OP (diamonds) line intensities, the mean populations of the odd rotational levels with N′ from 13 to 23 for the full X-shooter sample are shown. Filled symbols mark reliable data points that were used for the derivation of Trot (see Fig. 5), which is illustrated by the solid regression line.
The X-shooter NIR arm covers the strong O2a(0-0) band at 1.27 µm (see Fig. 5), which contains nine branches related to the dominating magnetic dipole transitions (e.g. Lafferty et al., 1998; Newman et al., 1999; Rousselot et al., 2000; Leshchishina et al., 2010). With the minimum resolving power of 3300 in the X-shooter NIR arm, the lines of two branches are sufficiently separated to allow individual intensity measurements. This could not be performed by Mulligan and Galligan (1995), who made the only known O2a(0-0) Trot measurements (see Sect. 1). The suitable branches are SR and OP (see Fig. 5), which have the same upper levels but differ by a change of N by −2 and +2, respectively. To fulfil the selection rules for ∆J, these ∆N have to be accompanied by an opposite change of the spin ∆S by 1. Only transitions with odd N′ originate from a¹∆g. For the line measurements, we focused on intermediate N′ between 13 and 23, which are sufficiently strong and not too close to the highly blended branches in the band centre. The line intensities were measured in a similar way as described in Sect. 3.1.2 for O2b(0-1). The effect of solar lines on the continuum could be neglected.
However, since the lower state is the ground state, the resulting line intensities had to be corrected for the strong self-absorption at low altitudes. The procedure was similar to the approach for OH described in Sect. 3.1.1 and Noll et al. (2015). Since the transmission curves for Cerro Paranal with a resolving power of about 1 × 10 6 calculated by Noll et al. . O2a(0-0) intensity relative to mean after midnight as a function of the time since sunset for the X-shooter sample (dots). The solid line shows the best fit of a constant and an exponential function to the data. The plot also displays the corresponding fit results for the SABER sample without longitude restriction and lower intensity integration limits of 40 (dashed) and 80 km (dash dotted). For the three fits, the resulting radiative lifetimes τ of the daytime population are shown in the legend.
(2012) were not accurate enough for O 2 a(0-0), we recalculated the O 2 absorption with the maximum resolving power of 4 × 10 6 inherent to the radiative transfer code LBLRTM (Clough et al., 2005). For a nominal temperature of 190 K for the O 2 a(0-0) emission layer, we obtained zenith transmissions between 0.14 and 0.80 for the measured S R lines and between 0.57 and 0.96 for the corresponding O P lines. The use of a fixed temperature for the Doppler broadening calculation has only a minor impact on the resulting T rot . We performed a second iteration of the transmission correction using the measured T rot . The mean T rot change compared to the first iteration was only about 0.1 K. Figure 6 shows the sample mean populations for the upper states of the measured O 2 a(0-0) lines. The populations were calculated by means of the molecular parameters of the HITRAN2012 database (Rothman et al., 2013), which are based on several studies (Gamache et al., 1998;Newman et al., 1999;Washenfelder et al., 2006;Leshchishina et al., 2010). Most lines indicate a tight linear relation of y and E ′ . The good agreement of the populations derived from S R and O P lines with the same N ′ is also convincing. The few outliers can be explained by blends with O 2 lines from other branches and OH lines. This result supports the expected good thermalisation of the rotational level populations of a 1 ∆ g (v = 0) due to the long radiative lifetime. The HITRAN data indicate an Einstein-A coefficient of 2.28 × 10 −4 s −1 (cf. Sect. 1), which corresponds to τ ≈ 73 min.
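The linear relation between population and upper-state energy can be turned into a temperature with a straight-line fit. The following is a minimal sketch, assuming that y denotes ln[I / (g′ A)] (line intensity divided by upper-state degeneracy and Einstein-A coefficient) and that E′ is given in cm⁻¹. The function names and all line parameters in the synthetic check are hypothetical and are not HITRAN values. The last line reproduces the lifetime implied by the quoted Einstein-A coefficient.

```python
import math

K_B_CM = 0.6950348  # Boltzmann constant k/(hc) in cm^-1 per K


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rotational_temperature(intensity, einstein_a, g_upper, e_upper_cm):
    """T_rot from the slope of y = ln(I / (g' A)) versus E' (cm^-1)."""
    y = [math.log(i / (g * a))
         for i, a, g in zip(intensity, einstein_a, g_upper)]
    _, slope = fit_line(e_upper_cm, y)
    return -1.0 / (K_B_CM * slope)


# Synthetic check: intensities built from a 190 K Boltzmann distribution.
# Energies, A coefficients, and degeneracies are made up for illustration.
t_true = 190.0
e_up = [250.0, 340.0, 440.0, 560.0, 690.0]
a_coef = [1.0e-4, 1.2e-4, 0.9e-4, 1.1e-4, 1.0e-4]
g_up = [27, 31, 35, 39, 43]
inten = [g * a * math.exp(-e / (K_B_CM * t_true))
         for g, a, e in zip(g_up, a_coef, e_up)]

t_rot = rotational_temperature(inten, a_coef, g_up, e_up)

# Radiative lifetime in minutes implied by A = 2.28e-4 s^-1.
tau_min = 1.0 / 2.28e-4 / 60.0
```

With real data, the scatter of the points around the regression line would provide the fit uncertainty, which this sketch does not compute.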
We can directly constrain τ from our data. Since the daytime production of a 1 ∆ g by ozone photolysis stops at sunset, the subsequent decrease of the O 2 a(0-0) intensity is an indicator of τ (e.g. Mulligan and Galligan, 1995;Pendleton et al., 1996;Nair and Yee, 2009). Since there are also a 1 ∆ g losses by collisions, the derived lifetime is only a lower limit. Figure 7 shows the O 2 a(0-0) intensities normalised to the mean after midnight as a function of the time after sunset for our X-shooter data set. The steep intensity decrease at the beginning of the night is evident. We fit this decrease by an exponential function plus 1. The latter reflects the population related to the nighttime a 1 ∆ g excitation, which is considered to be constant. This assumption is reasonable since the average nighttime variations are expected to be much weaker than those of the daytime-related population. The resulting lower lifetime limit is 46.1 ± 1.6 min, which is within the range of other ground-based measurements (Mulligan and Galligan, 1995;Pendleton et al., 1996). We can also derive τ from our SABER data set (Sect. 2.2). For the latitude range from 23 • to 26 • S, z ⊙ greater than 100 • , and a minimum height of 40 km, we obtained about 48 min, which is close to the X-shooter value. The SABER best-fit curve in Fig. 7 is shifted to earlier times due to the sparse twilight sampling of the X-shooter data with a minimum z ⊙ of 105 • . This could also contribute to the small difference of the τ values. As illustrated by Fig. 1, the decay of the daytime population is faster at lower altitudes, where collisions are more frequent. Therefore, the resulting τ depends on the considered altitude range. For a lower limit of 80 km, we obtained about 60 min (see Fig. 7). Also based on SABER data, Nair and Yee (2009) derived an Einstein-A coefficient consistent with the theoretical value for a height of 85 km.
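The decay fit described above (an exponential plus a constant of 1) can be sketched as follows. The fitting method is not stated in the text, so this example simply assumes a straight-line fit to ln(I − 1), which is exact for noise-free data. The function name and the synthetic data are hypothetical.

```python
import math


def fit_decay_lifetime(t_min, rel_intensity):
    """Fit I(t) = 1 + a*exp(-t/tau) via a straight line in ln(I - 1).

    Points with I <= 1 cannot be used in this linearised form and are
    skipped. Returns (a, tau) with tau in the units of t_min.
    """
    pts = [(t, math.log(i - 1.0))
           for t, i in zip(t_min, rel_intensity) if i > 1.0]
    n = len(pts)
    mt = sum(t for t, _ in pts) / n
    my = sum(y for _, y in pts) / n
    stt = sum((t - mt) ** 2 for t, _ in pts)
    sty = sum((t - mt) * (y - my) for t, y in pts)
    slope = sty / stt
    return math.exp(my - slope * mt), -1.0 / slope


# Noise-free synthetic intensities with tau = 46 min (illustrative only).
times = [30.0 + 15.0 * k for k in range(12)]
data = [1.0 + 8.0 * math.exp(-t / 46.0) for t in times]
amp, tau = fit_decay_lifetime(times, data)
```

For noisy measurements, a weighted non-linear least-squares fit would be more appropriate, since the logarithm amplifies the noise of points close to the constant level.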
As shown by Figs. 5 and 6, the optimal line set for the T rot retrieval consists of O P (13), S R (15), O P (15), S R (17), O P (17), and S R (21). In principle, the two lines with N ′ = 23 could also have been included since the T rot derived from the mean populations in Fig. 6 only differs by 0.1 K from the result for our standard line set. However, they are relatively faint, which causes significant uncertainties if the T rot are obtained from individual spectra. Using the mean populations and the extended line set, the systematic T rot uncertainty from the regression analysis is about 2.1 K.
The resulting T rot were corrected for different systematic effects. For this purpose, the 47 spectra taken with the narrow 0.4 ′′ and 0.6 ′′ slits were used. First, we checked the quality of the crucial O 2 self-absorption correction. This was achieved by multiplying the optical depth of all lines by a line-independent but variable factor and recalculating T rot to find the results with the lowest regression uncertainties. For the 47 test spectra, this procedure resulted in an optimal factor of 1.03 ± 0.03 and a T rot decrease by 1.1 ± 0.8 K. This small correction confirms the reliability of our radiative transfer calculations. Then, similar corrections as for O 2 b(0-1) (Sect. 3.1.2) were performed. With continuum windows optimised for the narrow slits instead of the wide 1.5 ′′ slit, we obtained a ∆T rot of −0.3 K. The effect of the spectral resolution on the line measurements resulted in T rot corrections of +1.2, +0.4, and +1.4 K for the 0.9 ′′ , 1.2 ′′ , and 1.5 ′′ slits. Taking the 0.6 ′′ slit as a reference, the T rot of the three spectra obtained with a 0.4 ′′ slit were also corrected and increased by 3.5 K. This relatively large correction is probably related to the starting resolution of underlying faint lines. In general, the applied corrections are small and partly cancel each other out. The systematic T rot uncertainty slightly increases from 2.1 to 2.2 K when the uncertainties of the T rot corrections are also considered.

Emission profiles
In order to compare the T rot derived from the discussed bands (Sect. 3.1), the differing height distributions of the emission have to be considered since the measured bands probe different parts of the mesopause temperature profile. Since the ground-based X-shooter data do not provide this information, we obtained VER and temperature profiles representative of Cerro Paranal from the satellite-based SABER data (Russell et al., 1999).
It is reasonable to assume that the v ′ -dependent emission peak altitudes are equidistant (cf. von Savigny et al., 2012) (see also Sect. 3.4). Even if there are significant deviations from this assumption, the maximum errors should only be of the order of a few 100 m, which is still relatively small compared to the typical FWHM of 8 to 9 km of the OH VER profiles (e.g. Baker and Stair, 1988). Then, the v ′ -dependent emission peak altitudes can be derived by means of effective v ′ for the two SABER channels. We achieved this by calculating OH line spectra from the vibrational level populations and T rot given by Noll et al. (2015) for HITRAN data and convolving them with the SABER filter curves (Baker et al., 2007). In this way, we obtained fractional contributions of the different OH bands and, finally, the effective v ′ , which are 4.57 and 8.29 for the 1.64 and 2.06 µm channels, respectively. Although we neglected the higher relative populations at high N ′ compared to those related to T rot (e.g. Cosby and Slanger, 2007) and did not consider the variability of the relative populations, the resulting values are sufficiently robust for our purpose. The corresponding altitude uncertainty should be well below 100 m. From ∆v ′ = 3.72 and the emission peak altitudes h peak of both OH channels, we derived a mean altitude difference of 0.37 km for ∆v ′ = 1 and our Cerro Paranal SABER sample consisting of 1685 measurements (Sect. 2.2). This is in good agreement with data from other studies (von Savigny et al., 2012). Since h peak is not very accurate due to a step size of 0.2 km (Sect. 2.2), we used the centre of the range limited by the interpolated half maximum positions h cen as a more robust measure to calculate the v ′ -dependent profile altitudes. In this case, the mean ∆v ′ = 1 altitude difference is 0.39 km with a standard deviation σ of 0.15 km.
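Under the equidistance assumption, the height for each v ′ follows from a linear interpolation or extrapolation between the two SABER channels. The sketch below illustrates this with the effective v ′ values from the text. The function name is hypothetical, and the two channel heights are invented so that the step equals 0.39 km per ∆v ′ = 1.

```python
V_EFF_164 = 4.57  # effective v' of the 1.64 um channel (from the text)
V_EFF_206 = 8.29  # effective v' of the 2.06 um channel (from the text)


def height_for_v(v, h_164, h_206):
    """Linearly inter/extrapolate the layer height (km) for level v.

    h_164 and h_206 are the heights measured in the two OH channels.
    """
    step = (h_206 - h_164) / (V_EFF_206 - V_EFF_164)  # km per unit v'
    return h_164 + (v - V_EFF_164) * step


# Hypothetical channel heights giving a step of 0.39 km per unit v'.
h_low = 87.2
h_high = h_low + 0.39 * (V_EFF_206 - V_EFF_164)
heights = {v: height_for_v(v, h_low, h_high) for v in range(2, 10)}
```

In practice, the interpolation would be applied to each SABER measurement separately, since the channel heights vary from profile to profile.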
The VER profiles of the two OH channels (see Fig. 8) show slightly different shapes. For the Cerro Paranal SABER sample, the mean FWHM are 8.1 and 9.1 km for the 1.64 and 2.06 µm channels, respectively. The corresponding σ are 1.9 and 2.1 km. For a better comparability of the v ′ -dependent profiles and the related temperatures (see Sect. 3.3), we used merged profiles. For all SABER measurements, we shifted both profiles normalised to the same integrated intensity to the calculated h cen for each v ′ and averaged them. The resulting mean profiles for the Cerro Paranal SABER sample are shown in Fig. 8. They have a FWHM of 8.6 km with a σ of 1.9 km. The average h cen range from 86.2 km (v ′ = 2) to 89.0 km (v ′ = 9) with a scatter between 1.3 and 1.4 km. The profiles are asymmetric. The mean h cen are about 0.6 km (σ = 1.1 km) higher than the corresponding h peak . For the effective height h eff , i.e. the VER-weighted altitude, the average difference is 1.2 km. The calculation of h eff excluded altitudes with negative VERs and |h − h cen | > 15 km (see also Sect. 3.3).
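The quantities h cen and FWHM rely on interpolated half-maximum positions. A minimal sketch of one way to obtain them from a sampled profile is given below. The function name and the asymmetric test profile are hypothetical, and linear interpolation between samples is an assumption, since the interpolation scheme is not specified in the text.

```python
def half_max_crossings(alt, ver):
    """Return (lower, upper) altitudes where ver crosses half its maximum.

    alt must be increasing. Crossings are linearly interpolated between
    the last sample above and the first sample below the half maximum.
    """
    peak = max(range(len(ver)), key=lambda k: ver[k])
    half = 0.5 * ver[peak]

    def cross(k_in, k_out):
        # k_in is at or above the half maximum, k_out is below it.
        f = (ver[k_in] - half) / (ver[k_in] - ver[k_out])
        return alt[k_in] + f * (alt[k_out] - alt[k_in])

    lo = peak
    while lo > 0 and ver[lo - 1] >= half:
        lo -= 1
    hi = peak
    while hi < len(ver) - 1 and ver[hi + 1] >= half:
        hi += 1
    lower = cross(lo, lo - 1) if lo > 0 else alt[0]
    upper = cross(hi, hi + 1) if hi < len(ver) - 1 else alt[-1]
    return lower, upper


def tri(h):
    """Asymmetric triangular test profile peaking at 87 km."""
    if h <= 87.0:
        return max(0.0, (h - 81.0) / 6.0)
    return max(0.0, 1.0 - (h - 87.0) / 10.0)


alt = [75.0 + 0.5 * k for k in range(61)]
ver = [tri(h) for h in alt]
lower, upper = half_max_crossings(alt, ver)
h_cen = 0.5 * (lower + upper)
fwhm = upper - lower
h_peak = alt[ver.index(max(ver))]
```

For this test profile with a longer upper wing, h cen lies above h peak, which is the same sense of asymmetry as reported for the merged OH profiles.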

O 2 b(0-1)
SABER does not provide VER profiles for O 2 b(0-1). For this reason, we assumed a simple Gaussian profile, which appears to be justified by the relatively small measured asymmetries (see Khomich et al., 2008). Based on values from the literature (see Sect. 1), where the measurements are usually based on O 2 b(0-0) (e.g. Yee et al., 1997), peak height and FWHM were set to 94.5 and 9 km (see Fig. 8). Since our O 2 b(0-1) profile is fixed, we checked the influence of variations of h peak and FWHM on the derived temperatures (see Sect. 3.3). For this analysis, we assumed the same variations of these parameters as described in Sect. 3.2.1 for OH. As a result, we found for individual profiles an average σ of 2.0 K. However, for the sample mean (where transient waves vanish), σ is only 0.5 K. Since we are mainly interested in average properties, the exact choice of h peak and FWHM for O 2 b(0-1) is not critical for our study.

O 2 a(0-0)
SABER has a dedicated channel for O 2 a(0-0) (see Sect. 2.2). This was one of the reasons for selecting SABER for this study since this band shows strong variations of the emission profile. In Fig. 1, this was illustrated for twilight conditions. As our X-shooter data set is a nighttime sample, we show the average emission profiles for the Cerro Paranal SABER sample at five different time intervals in Fig. 9. For all periods, most of the emission originates close to the OH emission layer (see also Fig. 8), which is crucial for a reliable T rot comparison (see Sect. 3.3). However, the emission peak indicates a clear transition from about 84 to 90 km in the first half of the night. The early-night peak is obviously caused by a daytime ozone (and its photolysis) maximum in the upper mesosphere, whereas the late-night peak appears to be related to atomic oxygen recombination and subsequent collisions (e.g. López-González et al., 1989;Mulligan and Galligan, 1995;Kaufmann et al., 2003) (see also Sect. 1).
The nocturnal variations of O 2 a(0-0) are illustrated in Fig. 10. Here, the effective heights h eff are given compared with those of OH(v ′ = 8). The calculation of the O 2 a(0-0) h eff excluded altitudes with negative VERs, which essentially appear in the lower mesosphere (see Fig. 9). Additional height limits as for OH (see Sect. 3.2.1) were not applied because of the strong profile variations. The general profile limits of 40 and 110 km (Sect. 2.2) are not an issue for the nighttime. Figure 10 shows a relatively large scatter of 2.4 km in the O 2 a(0-0) h eff at the beginning of the night, which is not seen for OH(v ′ = 8) (σ = 1.1 km). In the course of the night, the deviation (4.4 km) and scatter become smaller until a convincing linear relation with OH(v ′ = 8) is established. The correlation coefficient for the last period is r = 0.62, which is significantly higher than r = 0.35 for the first period. The increase of h eff for O 2 a(0-0) is partly compensated by a simultaneous rise of the OH layer of 1.1 km between the first and last period. The smallest discrepancy and scatter of 0.2 and 1.1 km for O 2 a(0-0) are found for the fourth nighttime interval. The result for the discrepancy will change with a different OH(v ′ ). The analysis of the h eff data confirms our expectation that an OH and O 2 a(0-0) T rot comparison is possible with only minor corrections for a wide range of nighttimes (see Sect. 3.3). A correction is not negligible even for equal h eff since the profile shapes and FWHM can be very different (see Fig. 8). The average emission profile FWHM for O 2 a(0-0) and OH are 12.9 and 8.7 km, respectively. There is still a difference of 3.6 km (corresponding to 11.7 and 8.0 km) for the fourth nighttime period, which shows the best agreement in terms of h eff . The emission profile widths for O 2 a(0-0) and OH decrease during the night. The difference is 4.6 and 2.3 km, respectively, if the FWHM of the first and last period are compared.
The emission profiles of O 2 a(0-0) and the negligible non-LTE contributions to the corresponding T rot make this band ideal for an analysis of the OH-related T rot . However, the strong profile variations do not allow one to assign a reference altitude for the O 2 a(0-0)-related temperatures. Therefore, we also introduce a reference profile, which is based on O 2 a(0-0) profiles observed in the second half of the night. Since the corresponding median h peak and FWHM are 89.8 and 10.9 km, respectively, we selected 90.0 and 11.0 km as reference values. Observed SABER VER profiles with similar properties were then averaged. As the Cerro Paranal SABER sample is too small for this, we used the larger sample without longitude restriction (see Sect. 2.2). We considered all 297 profiles with h peak between 88 and 90 km and FWHM between 10.8 and 11.2 km. To avoid any broadening of the averaged profile, the limits for the FWHM were more restrictive. In the case of h peak , this effect could be prevented by shifting the profiles to the reference height before the averaging. Figures 8 and 9 show the resulting reference profile, which is not significantly affected by the decaying daytime a 1 ∆ g population. The profile is slightly asymmetric with h cen = 90.1 km and h eff = 90.4 km.

Temperature corrections
The final step of the preparation of T rot data for a comparison is the correction of differences in the emission profiles. This requires some knowledge of the true temperature profile in the mesopause region. SABER provides this kind of data (T kin ), which are based on CO 2 -related measurements and non-LTE radiative transfer calculations (see Sect. 2.2). For the derivation of the sample-averaged temperature differences for slightly deviating emission profiles, the T kin uncertainties are probably small. However, we also want to directly compare X-shooter T rot and SABER T kin . For this reason, the temperature uncertainties have to be known in detail. According to Rezac et al. (2015) and J. M. Russell III (personal communication, 2015), the total uncertainties for single profiles are about 5 K at 90 km and tend to increase with altitude. Since we are interested in average properties, we can neglect the statistical errors, which leaves systematic uncertainties of 3 to 4 K in the relevant altitude range. This may further decrease if it is considered that systematic errors can also partly cancel out (Rezac et al., 2015). For this reason, we assume a relevant uncertainty of 2.5 K as input for Sect. 4. As an additional check, we compared the preferred version 2.0 data to those of version 1.07, which is possible for SABER data taken until the end of 2012. The old T kin show deviations from the new ones, which are between +2.0 K for OH(v ′ = 2) and −0.2 K for O 2 b(0-1). Assuming that the new T kin are more correct, the changes are consistent with the assumed errors.
To obtain effective temperatures T eff from the SABER T kin for the different bands, we used a similar procedure as described for h eff (Sects. 3.2.1 and 3.2.3), i.e. the T kin of the different altitudes were weighted depending on the related VERs for each emission layer. Altitudes with negative VERs were rejected. Moreover, weights were set to zero for altitude distances of more than 15 km from the OH h cen .
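The weighting described above can be written as a single VER-weighted mean that serves for both T eff (weighting T kin ) and h eff (weighting the altitude itself). The following is a minimal sketch. The function name and the toy profile are hypothetical, and the 15 km limit is applied relative to a supplied h cen as described for OH.

```python
def ver_weighted_mean(alt, ver, values, h_cen=None, max_dist=15.0):
    """VER-weighted average of `values` over altitude.

    Layers with negative VER get zero weight. If h_cen is given, layers
    farther than max_dist (km) from it are excluded as well.
    """
    num = 0.0
    den = 0.0
    for h, w, x in zip(alt, ver, values):
        if w < 0.0:
            continue
        if h_cen is not None and abs(h - h_cen) > max_dist:
            continue
        num += w * x
        den += w
    return num / den


# Toy profile (altitude in km, VER in arbitrary units, T_kin in K).
alt = [60.0, 80.0, 85.0, 90.0, 95.0, 110.0]
ver = [5.0, -0.1, 1.0, 2.0, 1.0, 3.0]
t_kin = [230.0, 200.0, 192.0, 188.0, 186.0, 240.0]

t_eff = ver_weighted_mean(alt, ver, t_kin, h_cen=90.0)
h_eff = ver_weighted_mean(alt, ver, alt, h_cen=90.0)
```

In the toy profile, the samples at 60 and 110 km are rejected by the distance limit and the sample at 80 km by its negative VER, so only the three central layers contribute.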
The difference between the resulting T eff of a given band and a desired reference band ∆T eff indicates the change of the effective true temperature by deviating altitudes, widths, and shapes of the corresponding emission profiles. For the derivation of possible non-LTE effects, these ∆T eff have to be subtracted from the measured T rot . The corrections cannot directly be applied since both quantities originate from different data sets, i.e. the Cerro Paranal SABER sample with 1685 profiles (Sect. 2.2) and the X-shooter sample with 343 spectra (Sect. 2.1). For this reason, we needed to obtain SABER-specific parameters such as T eff and h eff for each X-shooter spectrum. We achieved this by smoothing the SABER data in the two dimensions day of year (DOY) and time. As a smoothing filter, we used a 2-D Gaussian with σ of half a month (15.2 days) and half an hour. The Gaussians were calculated across the New Year boundary if required. For each X-shooter data point, we could then compute weights for the different SABER data points and, finally, weighted averages for the SABER-related parameters.
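The 2-D Gaussian weighting with wrap-around at the New Year boundary can be sketched as follows. The σ values are taken from the text. The assumed year length of 365.25 days, the treatment of time as a continuous axis in hours, and all function names are assumptions made for this illustration.

```python
import math

SIGMA_DOY = 15.2      # days (from the text)
SIGMA_TIME = 0.5      # hours (from the text)
YEAR_LENGTH = 365.25  # days; assumed length for the wrap-around


def doy_distance(doy_a, doy_b):
    """Shortest day-of-year distance, wrapping across the New Year."""
    d = abs(doy_a - doy_b) % YEAR_LENGTH
    return min(d, YEAR_LENGTH - d)


def gaussian_weight(doy_obs, time_obs, doy_sat, time_sat):
    """Weight of one SABER point for one X-shooter observation.

    The weight equals 1 in the centre of the Gaussian.
    """
    dd = doy_distance(doy_obs, doy_sat) / SIGMA_DOY
    dt = (time_obs - time_sat) / SIGMA_TIME
    return math.exp(-0.5 * (dd * dd + dt * dt))


def weighted_parameter(doy_obs, time_obs, sat_points):
    """Weighted mean of a SABER parameter.

    sat_points is an iterable of (doy, time_h, value) tuples.
    """
    num = 0.0
    den = 0.0
    for doy, t, val in sat_points:
        w = gaussian_weight(doy_obs, time_obs, doy, t)
        num += w * val
        den += w
    return num / den


# Two hypothetical SABER points placed symmetrically around an observation.
example = weighted_parameter(15.0, 1.0,
                             [(10.0, 1.0, 190.0), (20.0, 1.0, 194.0)])
```

The denominator in `weighted_parameter` is the weight sum with a central weight of 1, which gives a rough measure of how many SABER profiles effectively contribute to an average.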
A suitable smoothing procedure is crucial since both data sets have large gaps. In particular, the SABER data for our selected area around Cerro Paranal only cover a very narrow range of times for a certain DOY (see Sect. 2.2). It is desirable to average as many SABER data as possible to obtain reliable mean values. However, if this requires adding data taken under very different conditions, this might cancel out variability patterns that are required for realistic corrections. The SABER data set is already a compromise since it includes data that were taken several 100 km away from Cerro Paranal. Hence, we checked the influence of the smoothing procedure on the resulting weighted parameters. Focusing on the X-shooter sample mean T eff for the eight OH, two O 2 , and one reference emission layers (see Sect. 3.2), we found a mean change of about 0.1 K for a bisection of the Gaussian σ. A doubling of the limiting periods resulted in a corresponding mean shift of about 0.2 K. Moreover, a different smoothing function (e.g. an exponential function with a flat top for distances smaller than σ) had little influence on the results. Finally, the introduction of the solar activity (measured by the solar radio flux) as a third dimension did not significantly change the resulting T eff either. Therefore, we conclude that the details of the smoothing procedure for the SABER data are not critical for our analysis.
The weighted parameter averages mainly originate from a relatively small fraction of the SABER data. Within the 1σ perimeter of the 2-D Gaussian around the Cerro Paranal data points, the number of SABER profiles ranges from 0 to 41 for our applied procedure. The mean is 12. If a weight sum is used (defining weight = 1 in the centre of the Gaussian), values between 4 and 49 with a mean of 23 are found. These numbers suggest that profile-to-profile variations limit the accuracy of the resulting averages. This is confirmed by typical T eff mean errors between 1.3 K (O 2 a(0-0)) and 1.9 K (O 2 b(0-1)) for individual X-shooter observations. These uncertainties are comparable to the systematic T rot measurement errors (see Sect. 3.1). However, for sample averages, the statistical T eff -related errors strongly decrease. The full sample mean errors are only about 0.1 K. The uncertainties are further reduced if the ∆T eff for the T rot correction are taken into account. In this case, the individual mean errors range from 0.6 K (O 2 a(0-0)) to 1.3 K (O 2 b(0-1)) for the reference profile at 90 km described in Sect. 3.2.3. The uncertainties for the sample mean are negligible.
The most critical source of error is probably a significant deviation of the T eff climatologies from those related to the Cerro Paranal X-shooter data. This is relevant since the set of SABER profiles used covers a relatively wide area comprising 3 • in latitude and 20 • in longitude (see Sect. 2.2). Moreover, there are large gaps in the local time coverage for a given month. In order to investigate this, we used the five nighttime periods first introduced in Fig. 9. For the OH(v ′ ) and two O 2 bands, we then measured the sample-averaged difference of the X-shooter-related T rot and the SABER-related T eff for each time interval. Finally, mean values and standard deviations were calculated for the five data points. This provides us with a rough estimate of the agreement of the X-shooter and SABER nocturnal variability patterns. The resulting standard deviations range from 0.7 K for OH(v ′ = 9) to 2.1 K for OH(v ′ = 2). The average for all emission layers is 1.5 K. We can repeat this calculation by taking the temperature differences for two bands, which is needed for the emission profile correction. For the emission of O 2 a(0-0) as reference profile, the standard deviations now range from 0.2 K for OH(v ′ = 3) to 3.2 K for O 2 b(0-1), which reflects the relatively low effective height of the reference layer. Our preferred reference profile at 90 km cannot be used in this context since T rot measurements do not exist. For this reason, we calculated the standard deviations for all possible band combinations and performed a regression analysis depending on the related height differences ∆h eff . Then, the climatology-related uncertainties can be estimated only based on differences of emission altitudes. This approach assumes that h eff is the major driver for the changes in the variability pattern, which is likely since the widths of the emission layers do not differ a lot. 
Moreover, a linear change of the variability with ∆h eff is required, which is more difficult to fulfil if the height differences are relatively large. Significant variations of the OH non-LTE effects could influence the fit (see Sect. 4). For our X-shooter sample, we obtain a slope of 0.37 K km −1 and an offset of 0.47 K. The latter includes influences like FWHM differences. The fit is robust since the uncertainties of both values are lower by more than an order of magnitude. For a change from a band-specific emission profile to our reference profile, the linear fit results in systematic errors in the temperature correction by the different climatologies for the SABER and X-shooter data between 0.7 K for OH(v ′ = 9) and 2.0 K for O 2 b(0-1). For the discussion of seasonal variations in Sect. 4.3, we also performed the whole analysis by using double month periods, which have a similar length as the TIMED yaw cycle of about 60 days (Russell et al., 1999). Independent of the month grouping starting with December/January or January/February, we obtained slightly higher errors between 1.1 K for OH(v ′ = 9) and 2.3 K for O 2 b(0-1). Our estimates might be lower than the real systematic uncertainties due to the low bin number, which is limited by the small data sets and gaps in the time coverage (cf. Sect. 4.3).
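The linear relation from the regression can be evaluated directly for any pair of emission layers. The sketch below uses the quoted slope and offset. It assumes that the absolute height difference enters the relation, and the function name is hypothetical. The example uses the fixed O 2 b(0-1) peak height of 94.5 km as an approximation of its h eff (reasonable for a symmetric Gaussian) and the reference-profile h eff of 90.4 km.

```python
SLOPE_K_PER_KM = 0.37  # slope of the fit quoted in the text
OFFSET_K = 0.47        # offset of the fit quoted in the text


def climatology_error(h_eff_band, h_eff_ref):
    """Estimated systematic temperature-correction error in K.

    Assumes the error depends on the absolute difference of the
    effective emission heights (km).
    """
    return OFFSET_K + SLOPE_K_PER_KM * abs(h_eff_band - h_eff_ref)


# O2b(0-1) Gaussian centred at 94.5 km versus the reference h_eff of 90.4 km.
err_o2b = climatology_error(94.5, 90.4)
```

Under these assumptions, the result is close to the 2.0 K quoted for O 2 b(0-1). For identical heights, only the offset of 0.47 K remains.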
The relatively large distances between Cerro Paranal and the tangent points where the SABER profiles were taken are an important reason for the deviating variability. With the approach described above, we can check the robustness of our area selection limits. Focusing on the |λ − λ CP | ≤ 10 • criterion (Sect. 2.2), we also studied 5 • and 20 • limits. For the correction of the differences between the emission profiles of the measured bands and the reference profile, we find average temperature uncertainties of 1.2, 1.3, and 1.4 K for the 5 • , 10 • , and 20 • limits, respectively. The averages were derived from the results for the eight OH and two O 2 emission layers. The errors show the expected increase with growing longitude interval. However, the changes are small enough that the selection of the limit does not need to be very accurate. A tight longitude range is not recommended since this would significantly shrink the SABER sample and, hence, increase the statistical errors.

Emission heights and temperature correlations
In the same way as described in Sect. 3.3 for T eff , the band-specific h eff (Sect. 3.2.1) were derived for each X-shooter observation. It is possible to check the reliability of these data by comparing average h eff with correlation coefficients r for the correlation of the corresponding T rot with the T rot for a reference band derived from the same data set (Sect. 3.1). Assuming a gradual change of the temperature variability pattern with height, the T rot -based r should be a function of the altitude difference of the emission layers used for the correlation analysis. The required altitude-dependent temperature variations can be caused by waves like tides (e.g. Marsh et al., 2006). A complication could be v ′ -dependent variations of the non-LTE contributions to the measured OH T rot (see Sect. 4). As r is sensitive to measurement uncertainties, we only selected the most reliable OH band for each v ′ , i.e. we used (2-0), (3-1), (4-2), (5-3), (6-2), (7-4), (8-5), and (9-7). The results for the latter band are only based on 240 instead of 343 spectra due to a lack of wavelength coverage and issues with the flux calibration (see Sect. 2.1). Figure 11 shows the mean h eff vs. r for the X-shooter sample with O 2 b(0-1) as a reference band (r = 1). There is a striking linear relation for the OH data, which supports our assumption of equidistant emission layers (Sect. 3.2.1). The slope of a linear fit is 7.6 ± 0.4 km.
The deviation of OH(9-7) may be explained by the different sample and the measurement uncertainties. O 2 a(0-0) is very close to the fit and best agrees with OH(6-2). This is another argument for using this band for temperature comparisons with OH data. O 2 b(0-1) deviates by about 2 K from the extrapolated fit, which is most probably caused by a non-linear relation of h eff and r for the large emission height differences. The mesopause temperature minimum, which is close to the O 2 b(0-1) emission layer (see Sect. 4), might have an impact. Our interpretations are supported by the r that we derived from the SABER-related T eff for the X-shooter sample (Fig. 11). There is an almost perfect linear relation of h eff and r for these data, which is in good agreement with the found equidistance of the OH layers and the position of O 2 a(0-0) in relation to the OH bands. The position of O 2 b(0-1) deviates in a similar way from a linear fit of the OH data with a slope of 16.0 ± 0.4 km. The correlations for T eff are significantly stronger than for T rot . The relatively low signal-to-noise ratio of the O 2 b(0-1) data compared to those of the other bands can have a small impact. However, a direct correlation for e.g. v ′ = 3 and 8 revealed r = 0.86 for T rot and 0.97 for T eff . Possible variations of the OH T rot non-LTE contributions do not appear to be sufficiently strong since the r difference for O 2 a(0-0) is in good agreement with those of OH (cf. Sect. 4.2). Therefore, the best explanation for the r discrepancies is the weighted averaging of the SABER measurements to derive representative values for the X-shooter data set (Sect. 3.3). This effect can be estimated by calculating r for the T eff of the original Cerro Paranal SABER sample with 1685 profiles (Sect. 2.2). As shown in Fig. 11, the OH r values for these data with a slope of 10.2 ± 0.1 km are very close to those for T rot , i.e. the r differences are indeed related to the averaging procedure.
The position of O 2 a(0-0) deviates from the trend for OH, which is probably caused by differences in the X-shooter and SABER sample properties. The latter also contributes to the remaining differences between the OH data points. Moreover, the limited vertical resolution of the SABER profiles can cause higher r for T eff than for T rot . The sampling of about 0.4 km (Sect. 2.2) is similar to the height difference of adjacent OH(v ′ ) emission layers (Sect. 3.2.1). Therefore, small-scale variations cannot be probed by SABER.

Figure 11. Effective emission height heff in km from SABER data vs. the correlation coefficient r for the correlation of the Trot (filled symbols), Teff for the full X-shooter sample (open symbols), and Teff for the Cerro Paranal SABER sample (crosses) of OH(v ′ ) and O2a(0-0) with those of O2b(0-1). The dash-dotted (Trot, X-shooter sample), long-dashed (Teff, X-shooter sample), and dotted line (Teff, SABER sample) show fits of the OH data, which are connected by short-dashed lines.

Results and discussion
After the derivation of T rot corrected for the emission layer differences (Sect. 3), we can finally compare these temperatures and quantify the OH-related non-LTE effects depending on v ′ and time. Figure 12 gives an overview of the mean T rot and their uncertainties for the eight OH(v ′ ), O 2 b(0-1), and O 2 a(0-0) derived from the full X-shooter sample of 343 spectra (see Sect. 3.1). The T rot are plotted in comparison to the corresponding SABER-based h eff (Sect. 3.2) projected onto the X-shooter sample (Sect. 3.3). The FWHM of the emission profiles are indicated by vertical bars. The average O 2 a(0-0) profile is the widest with about 12.7 km. The figure shows a large overlap of the OH and O 2 a(0-0) emission profiles, as already demonstrated by Fig. 8. However, there are large differences in the measured OH T rot , which range from 188.1 K for v ′ = 2 to 202.5 K for v ′ = 8. As discussed in Noll et al. (2015) and Sect. 1, the maximum at v ′ = 8 is typical of the T rot (v ′ ) pattern and can be explained by the contribution of a nearly nascent population, which shows a particularly hot rotational level population distribution for this v ′ .

Figure 12. Comparison of temperatures in K and the corresponding heights in km for the full X-shooter sample. Solid symbols show X-shooter Trot measurements and their uncertainties (excluding errors related to the molecular parameters where HITRAN was used) for OH from v ′ = 2 to 9 (cf. Fig. 11) using the first three P1 lines (circles), O2a(0-0) (diamond), and O2b(0-1) (square). The effective emission heights were derived by means of SABER data. The bars in height direction mark the profile width at half maximum. The cross is related to the SABER Tkin for the O2a(0-0)-based reference profile. Open symbols indicate the Tkin results for the emission profiles of the bands measured by X-shooter. For these data, we assume temperature uncertainties of 2.5 K. The solid curve shows the mean Tkin profile for the X-shooter sample.

Temperature comparison for reference profile
The mean T rot of O 2 a(0-0) is at the lower end of the OH T rot distribution with 191.2 ± 2.2 K. The O 2 b(0-1) T rot of 184.3 ± 1.8 K is even lower. These temperatures can be compared to the SABER-based T eff for the X-shooter sample with an estimated uncertainty of about 2.5 K (Sect. 3.3). While the T rot and T eff for each O 2 band agree within the errors, this is not the case for all the OH bands. The v ′ dependence of the temperature is also very different. The OH T eff only range from 190.5 to 190.7 K, i.e. the temperature gradient is almost zero. In Fig. 12, we also display the mean T kin profile calculated for the X-shooter sample. Indeed, the mean temperature profile is relatively flat in the altitude range of the OH emission. Together with the large profile overlap, this explains the almost absent temperature gradient. This implies that the contribution of real temperature variations to the characteristic T rot (v ′ ) pattern is small. Non-LTE effects have to dominate. Only the relatively low T rot of O 2 b(0-1) can be explained by the proximity of the mesopause temperature minimum.

Figure 13. Comparison of mean temperatures in K for the O2a(0-0)-based reference profile and the full X-shooter sample. The v ′ -dependent OH Trot (circles) are based on measurements of the first three P1 lines and HITRAN-based molecular parameters. The systematic errors related to the latter are not included in the plotted error bars. HITRAN-based molecular parameters were also used for the derivation of the Trot from O2a(0-0) (diamond) and O2b(0-1) (square). The cross shows the CO2-based SABER Tkin for the reference profile and the full X-shooter sample. The mean and the uncertainties derived from the three non-OH temperatures are marked by horizontal lines.

Figure 12 also shows the h eff , FWHM, and T eff of the O 2 a(0-0)-based reference profile introduced in Sect. 3.2.3. T eff is 189.2 ± 2.5 K. The different T rot can be corrected to be representative of this profile that peaks at 90 km and has a FWHM of 11 km. This can be done by subtracting the difference between the T eff related to a given band and the reference profile from the T rot for the selected band (see Sect. 3.3). The resulting temperatures are shown in Fig. 13. For O 2 a(0-0) and O 2 b(0-1), we obtained 188.6 ± 2.5 K and 186.3 ± 2.7 K, respectively. The uncertainties are a combination of the measurement uncertainties already shown in Fig. 12 and the temperature correction errors discussed in Sect. 3.3. The two O 2 -related temperatures and the CO 2 -based SABER T eff agree very well within their errors. Since these three temperatures are not affected by non-LTE effects, we can combine them, which results in an average temperature of 188.0 ± 1.6 K. Clemesha et al. (2011) published a mean temperature profile based on sodium lidar measurements carried out at São José dos Campos in Brazil (23 • S, 46 • W) from 2007 to 2010. For our reference emission and their temperature profile, this implies a T eff of 188 K, which is in remarkable agreement with our measurement despite their assumed absolute errors of about 5 K, differences in the time coverage (no January data), and the small but probably non-negligible differences in latitude and longitude. For the latter effect, see Fig. 2. The upper part of the lidar-based temperature profile is also in good agreement with the SABER one in Fig. 12.
The lower part shows some discrepancies due to a different height and strength of the secondary minimum, which is at about 88 km. In the SABER case, there is only a nearly constant temperature down to about 81 km, which causes a lower onset of the steep temperature increase that is already visible in the Clemesha et al. (2011) profile above 80 km. The comparison with satellite and lidar data shows that the T rot from O 2 a(0-0) and O 2 b(0-1) line measurements in X-shooter spectra provide reliable temperatures in the mesopause region.
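The correction to the reference profile described above amounts to a single subtraction, which can be sketched as follows. The function name is hypothetical. The inputs 202.5 K (uncorrected T rot for v ′ = 8) and 189.2 K (T eff of the reference profile) are taken from the text, whereas the band-specific T eff of 190.5 K is an assumption chosen from the quoted OH range of 190.5 to 190.7 K. The unweighted mean of the three non-OH temperatures is likewise only an assumed way of combining them; the weighting and the derivation of the quoted uncertainty are not reproduced here.

```python
def correct_to_reference(t_rot, t_eff_band, t_eff_ref):
    """Shift a measured T_rot to the reference emission profile.

    The difference between the band-specific and the reference T_eff
    is subtracted from the measured rotational temperature.
    """
    return t_rot - (t_eff_band - t_eff_ref)


# OH v' = 8: measured T_rot and reference T_eff from the text;
# the band T_eff of 190.5 K is an ASSUMED value within the quoted range.
t_corr_v8 = correct_to_reference(t_rot=202.5, t_eff_band=190.5, t_eff_ref=189.2)
print(f"corrected T_rot(v'=8) = {t_corr_v8:.1f} K")

# Corrected O2a(0-0), corrected O2b(0-1), and SABER reference T_eff (from the text).
non_oh_temperatures = [188.6, 186.3, 189.2]
# ASSUMPTION: simple unweighted mean of the three values.
t_lte_mean = sum(non_oh_temperatures) / len(non_oh_temperatures)
print(f"mean non-OH temperature = {t_lte_mean:.1f} K")
```

With these inputs, the corrected value for v ′ = 8 is close to the 201.2 K discussed for the reference profile, and the unweighted mean is close to the quoted 188.0 K.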
The OH T rot indicate significant non-LTE contributions in comparison to our LTE reference temperature of about 188 K. The temperatures range from 186.7 ± 2.7 K for v ′ = 2 to 201.2 ± 1.3 K for v ′ = 8. In the latter case, the difference is about 13 K. With the estimated errors, the ∆T non-LTE of v ′ = 8, 9, and 6 are highly significant (> 3σ). The situation for v ′ = 2 and 3 is unclear. The OH bands (2-0) and (3-0), which by far show the lowest T rot , are also affected by relatively large systematic uncertainties. OH(2-0) is partly absorbed by water vapour (more than any other band considered) and OH(3-0) is partly affected by line blending and close to the upper wavelength limit of the X-shooter VIS arm. Therefore, it cannot be excluded that the true temperatures of v ′ = 2 and 3 are similar to those of higher v ′ . The plotted temperatures are only valid if the T rot are derived from the first three P 1 -branch lines. For example, the resulting ∆T non-LTE and the significance of the non-LTE effects would increase if P 1 (4) was also used. In this case, the T rot would be higher by about 1 K for all v ′ . The change to the first four P 2 -branch lines would even increase the OH T rot by about 11 K. Moreover, the discussed OH T rot are based on Einstein coefficients and level energies from the HITRAN database (Rothman et al., 2013), which are related to . The results change if we use the molecular parameters of van der Loo and Groenenboom (2007, 2008) (Fig. 14). The T rot are more uncertain and tend to be lower than those based on HITRAN data. If we neglect the very uncertain data point for v ′ = 2 (see Noll et al., 2015), the mean shift is 4.2 K. Hence, only the ∆T non-LTE of 10 K for v ′ = 8 remains significant with more than 3σ. Although the T rot based on van der Loo and Groenenboom (2008) are less reliable than those based on , these results demonstrate that the uncertainties of the molecular parameters limit the accuracy of ∆T non-LTE that can currently be achieved. Nevertheless, ∆T non-LTE of more than 10 K for v ′ = 8 are plausible if the first three P 1 -branch lines are used. Finally, the results show that the consideration of emission profile differences does not significantly change the characteristic T rot (v ′ ) pattern found by Cosby and Slanger (2007) and Noll et al. (2015). It can only be explained by non-LTE effects.

Figure 15. v ′ -dependent non-LTE contributions to OH Trot based on the first three P1 lines and HITRAN data for five nighttime LT periods derived from the full X-shooter sample. The LTE base temperature for each v ′ was derived from the corresponding O2a(0-0) Trot corrected for the difference in the VER profiles. The uncertainties of the absolute ∆Tnon-LTE are similar to those given in Fig. 13 (excluding the unknown uncertainties of the molecular parameters). Comparisons of ∆Tnon-LTE for the different nighttime periods and/or v ′ are more reliable. The errors should be of the order of 1 K, i.e. very likely smaller than 2 K.

Noll et al. (2015) studied the nocturnal variations of the OH T rot by means of five nighttime periods, which we also used in Figs. 9 and 10. The analysis resulted in HITRAN-based T rot differences (i.e. maximum minus minimum) between 4.8 K for v ′ = 5 and 7.0 K for v ′ = 6. In order to separate non-LTE from temperature profile variations, we used the observed O 2 a(0-0) profiles as reference for an emission profile correction. We did not take the reference profile at 90 km, which was the basis for the discussion in Sect. 4.1, since the use of the observed profiles reduces the uncertainties related to the temperature correction from about 1.2 to 0.9 K on average. At the beginning/end of the night, the uncertainties are higher for high/low v ′ . The resulting ∆T non-LTE are displayed in Fig. 15. Compared to the uncorrected T rot , the time-related variation for fixed v ′ significantly decreased.
The ∆T non-LTE differences range from 0.6 K for v ′ = 3 to 3.6 K for v ′ = 9. Hence, the average nocturnal variations of the non-LTE contributions to the OH T rot are significantly lower than their actual values. Moreover, at least the low v ′ indicate that the T rot variation is obviously dominated by the variability of the real temperature profile. However, it seems that only one third of the T rot (v ′ = 9) variability can be explained in this way. Figure 15 indicates a nocturnal trend of increasing ∆T non-LTE . The trend appears to be stronger for higher v ′ . The significance is about 3σ for the differences between the first and last interval and is limited by the statistical mean errors with an average of about 1.1 K and at least part of the temperature correction errors due to the changing O 2 a(0-0) emission profile. Noll et al. (2015) discovered a significant increase of the T rot differences of adjacent intermediate v ′ for the second compared to the first nighttime period. In Fig. 15, this causes a relatively strong ∆T non-LTE increase for v ′ = 6 but a decrease for v ′ = 5 and 7. In order to better evaluate the reliability of these trends, we also calculated ∆T non-LTE based on OH(v ′ = 2) instead of O 2 a(0-0) emission profiles. Although the absolute ∆T non-LTE are less trustworthy and possible weak non-LTE T rot variations for v ′ = 2 are neglected, this approach is justified for the comparisons of nighttime intervals and v ′ since it can further reduce the uncertainties related to the temperature correction. The emission profile differences between OH(v ′ = 2) and the other OH bands are much smaller and less variable than those related to O 2 a(0-0). The resulting Fig. 16 does not show a ∆T non-LTE decrease between the first and second nighttime period. However, significant increases appear to be limited to v ′ = 6, 8, and 9. For v ′ = 5 and 7, the major change happens between the third and fourth period, i.e. after midnight. 
The last period indicates a decrease of ∆T non-LTE for most v ′ . For v ′ ≥ 6, the ∆T non-LTE increases between the first and fourth period vary between 3.6 K (v ′ = 7) and 4.3 K (v ′ = 8). The significance is higher than for the results based on O 2 a(0-0) as reference band.
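The significance statements made for the reference-profile comparison (e.g. more than 3σ for v ′ = 8) can be illustrated with a short calculation. This is only a sketch: the function name is hypothetical, and the combination of the two uncertainties in quadrature assumes independent errors, which is not stated in the text and may not hold strictly because the corrected temperatures share part of the correction error. The input values are the corrected OH T rot and the LTE reference temperature of 188.0 ± 1.6 K quoted for the reference profile.

```python
import math


def non_lte_excess(t_oh, err_oh, t_lte, err_lte):
    """Return (delta_T, combined_error, significance) for an OH T_rot.

    ASSUMPTION: both uncertainties are independent and add in quadrature.
    """
    delta = t_oh - t_lte
    combined = math.hypot(err_oh, err_lte)
    return delta, combined, delta / combined


T_LTE, ERR_LTE = 188.0, 1.6   # K, mean of the non-OH temperatures (from the text)

# Corrected OH T_rot for v' = 8 and v' = 2 (from the text)
delta_v8, err_v8, signif_v8 = non_lte_excess(201.2, 1.3, T_LTE, ERR_LTE)
delta_v2, err_v2, signif_v2 = non_lte_excess(186.7, 2.7, T_LTE, ERR_LTE)

print(f"v'=8: dT = {delta_v8:.1f} K, {signif_v8:.1f} sigma")
print(f"v'=2: dT = {delta_v2:.1f} K, {signif_v2:.1f} sigma")
```

Under these assumptions, the excess for v ′ = 8 clearly exceeds 3σ, whereas the value for v ′ = 2 is compatible with zero, in line with the description of v ′ = 2 as unclear.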

Nocturnal variations of temperature differences
The ∆T non-LTE variations found appear to be in agreement with those of the OH emission layer height (see Sect. 1). Up to the fourth nighttime period, the sample-averaged h eff increases by 0.9 km (v ′ = 9) to 1.7 km (v ′ = 2). Afterwards the emission layers sink again except for v ′ = 9. See Fig. 10 for the h eff trend related to v ′ = 8. As already discussed by Noll et al. (2015) based on T rot differences and vibrational level populations, a higher emission layer is in a lower density environment, which decreases the frequency of thermalising collisions by O 2 (e.g. Adler-Golden, 1997; von Savigny et al., 2012; Xu et al., 2012) and, consequently, can increase the T rot non-LTE effects. The SABER data show that the h eff increase is more related to a VER decrease at low altitudes than an increase in the upper part of the emission layer. This trend mainly depends on changes in the distribution of atomic oxygen (which is required for the O 3 production) by chemical reactions and atmospheric dynamics (Lowe et al., 1996; Marsh et al., 2006; Nikoukar et al., 2007).

In order to better estimate the effect of the VER profile change on the T rot non-LTE effects, we calculated the fraction of OH emission above a certain altitude as a function of the nighttime period. For this, we used the original VER profiles of the two SABER channels at 1.64 and 2.06 µm corresponding to v ′ = 4.6 and 8.3, respectively (Sect. 3.2.1). For a minimum altitude of 95 km, we found an increase from 4.7 % (first period) to 7.3 % (fourth period) for v ′ = 4.6 and from 9.4 % to 12.0 % for v ′ = 8.3. The differences are similar for both SABER channels but the absolute values are higher for v ′ = 8.3. The latter effect and the relative change of the emission fractions with time become stronger with increasing minimum altitude. These results show the large impact of a relatively small rise of the emission layer on the highest parts of the VER profile.

To fully understand Fig. 16, it will be necessary to model OH T rot depending on v ′ and altitude, which has not been done so far. While there is some knowledge on the nascent population distribution over the rotational levels of the different v (Llewellyn and Long, 1978), it is not clear how this distribution is modified by collisional quenching.

Figure 17 illustrates the effect of the changing emission layers and temperature profile on the measured T rot by comparing T rot , T eff , and T kin for the first and second half of the night. The plots are similar to Fig. 12 discussed in Sect. 4.1. The OH T rot show a wider spread in the second half of the night. This can partly be explained by a significant change of the temperature profile. The SABER data reveal an average nocturnal trend from a single deep mesopause temperature minimum at the beginning to a double minimum, where the secondary trough is almost as deep as the primary one. The average altitudes of both minima are 98 to 99 km and between 80 and 85 km. Observations of a secondary minimum are not unusual (von Zahn et al., 1996; Shepherd et al., 2004; Friedman and Chu, 2007; Clemesha et al., 2011) and can be explained by tidal perturbations and tide-gravity wave interactions (Meriwether and Gardner, 2000; Meriwether and Gerrard, 2004). The appearance of a secondary mesopause minimum causes a temperature inversion at the main OH emission altitudes. The comparison of the second with the first half of the night, therefore, shows a T kin increase above and decrease below 85 km. The OH T eff are higher in the second period and this increase is stronger for higher v ′ due to the different temperature gradients. The amounts are 1.9 K for v ′ = 2 and 3.5 K for v ′ = 9. Nevertheless, a T rot increase with v ′ remains, which is disturbed by the odd/even v ′ dichotomy, as indicated by Figs. 15 and 16. The O 2 -related T rot follow the changes of the temperature profile convincingly.
As expected, the emission height, profile width, and temperature for O 2 a(0-0) are much more similar to those for the reference profile (Sect. 3.2.3) in the second half of the night.

Noll et al. (2015) also studied the T rot (v ′ ) of the four meteorological seasons. The X-shooter data indicated higher temperatures for the equinoxes than for the solstices irrespective of v ′ . The maximum differences were between 7.4 K for v ′ = 2 and 3.3 K for v ′ = 8. After applying a temperature correction based on O 2 a(0-0), these differences decreased to 4.7 and 1.2 K, as shown in Fig. 18. These values imply stronger non-LTE effects near the equinoxes than near the solstices. Moreover, the effects seem to be stronger for lower v ′ . This contradicts the results and explanations discussed in Sect. 4.2. The apparent trend is most probably caused by an inefficient temperature correction. Although the reduction of the seasonal variations by the conversion from T rot to ∆T non-LTE was largest for low v ′ (2.7 vs. 2.1 K, see above), it was obviously insufficient to obtain reliable seasonal variations of the OH T rot non-LTE contributions. This is probably related to the TIMED yaw cycle of 60 days, which causes large gaps in the seasonal coverage for fixed local times (Sects. 1 and 2.2). Therefore, the true seasonal variations could be stronger than those used for the temperature correction. At Cerro Paranal and similar latitudes, studying seasonal variations is more challenging than an analysis of the nocturnal variations since the combination of semi-annual and the weaker annual oscillations for OH T rot (Takahashi et al., 1995; Gelinas et al., 2008) is more complex than the night trends (Takahashi et al., 1998; Gelinas et al., 2008; Noll et al., 2015), which are dominated by the solar migrating diurnal tide (Marsh et al., 2006; Smith, 2012).
Hence, the limitations of our SABER and X-shooter data sets do not allow us to conclude on seasonal variations of the OH T rot non-LTE effects.
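The emission fractions above a minimum altitude, which were used in the discussion of the nocturnal rise of the OH emission layer, can be illustrated with a simple integration of a VER profile. The following sketch is not based on SABER data: the function names, the Gaussian layer shape, the FWHM of 9 km, and the rise of 1 km are purely hypothetical choices that only demonstrate how sensitive the upper wing of a profile is to a small upward shift of the layer.

```python
import math


def trapz(y, x):
    """Trapezoidal integral of the samples y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def fraction_above(heights_km, ver, h_min_km):
    """Fraction of the integrated emission at or above h_min_km.

    For simplicity, h_min_km must coincide with a grid point.
    """
    start = heights_km.index(h_min_km)
    return trapz(ver[start:], heights_km[start:]) / trapz(ver, heights_km)


def gaussian_profile(heights_km, peak_km, fwhm_km):
    """Hypothetical Gaussian VER profile in arbitrary units."""
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((h - peak_km) / sigma) ** 2) for h in heights_km]


heights = [70.0 + 0.5 * i for i in range(81)]      # 70 to 110 km, 0.5 km steps
early = gaussian_profile(heights, 87.0, 9.0)       # hypothetical early-night layer
late = gaussian_profile(heights, 88.0, 9.0)        # same layer raised by 1 km

frac_early = fraction_above(heights, early, 95.0)
frac_late = fraction_above(heights, late, 95.0)
print(f"fraction above 95 km: {100 * frac_early:.1f} % -> {100 * frac_late:.1f} %")
```

In this toy case, the fraction of the emission above 95 km grows much more strongly in relative terms than the 1 km shift of the peak would suggest, which is the qualitative behaviour described for the SABER profiles.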

Conclusions
Based on the VLT/X-shooter data set of Noll et al. (2015) with 343 spectra, we could compare their measurements of rotational temperatures T rot from 25 OH bands covering v ′ = 2 to 9 with new results for the bands O 2 b(0-1) and O 2 a(0-0). For the latter band, it was just the second time that T rot could be derived and the first time that it could be done with spectroscopically resolved lines. Since O 2 a(0-0) emits at altitudes similar to those of OH and the rotational level populations appear to be consistent with a Boltzmann distribution, the accurate T rot for O 2 a(0-0) could be used to quantify the non-LTE contributions to the OH T rot . To correct for the different emission profiles and the corresponding VER-weighted T eff , we used OH, O 2 a(0-0), and CO 2 -based T kin profile data from TIMED/SABER suitable for Cerro Paranal. The correction is particularly important for the beginning of the night when the O 2 a(0-0) emission peaks several km below the nighttime average due to a decaying daytime population from ozone photolysis. From the X-shooter data set, we obtained an effective population lifetime of about 46 min, which is in good agreement with SABER and other data. The required emission profiles for the different OH v ′ were derived from the two OH SABER channels under the assumption that emission peaks are equidistant for adjacent v ′ . This approach is supported by a comparison of the effective emission heights h eff with temperature correlation coefficients for the X-shooter and SABER samples, which showed nearly linear relations.
For an O 2 a(0-0)-based reference profile at 90 km, we compared the corrected temperature measurements and found that the results for O 2 b(0-1) (where a fixed emission profile at 94.5 km was assumed), O 2 a(0-0), and the SABER T kin agree very well within the errors. The sample mean temperature at the reference altitude was 188.0 ± 1.6 K. Except for the uncertain v ′ = 2 and 3, the resulting OH-related temperatures were significantly higher. For the first three P 1 -branch lines and HITRAN molecular data, a maximum difference of about 13 K was found for v ′ = 8. The characteristic T rot (v ′ ) pattern found by Cosby and Slanger (2007) and Noll et al. (2015) showing an odd/even v ′ dichotomy could also be seen in the corrected temperature data. The amount of the non-LTE contributions to the OH T rot , ∆T non-LTE , strongly depends on the selected lines. A change of the line set affects all OH bands in a similar way. Apart from measurement and correction errors, the still uncertain molecular parameters limit the accuracy of the derived ∆T non-LTE to a few K.
An analysis of the nocturnal variations of ∆T non-LTE with five nighttime periods revealed that a significant fraction of the OH T rot variability is caused by changes in the emission and mesopause kinetic temperature profiles. However, fractions above 50 % could only be found for low v ′ . At Cerro Paranal, the residual non-LTE effects tend to increase during the night and to be more variable at higher v ′ . The largest ∆T non-LTE differences were found for the first (before 21:00 LT) and fourth period (between 01:00 and 03:00 LT), and v ′ ≥ 6. They amount to about 4 K with an uncertainty of about 2 K. The amplitude of the variations appears to be only a minor fraction of the total amount of the non-LTE effects. The observed trends could be explained by the nocturnal rise of the OH emission layer, which we see in the SABER data up to the fourth nighttime period, and a corresponding reduction of the thermalising collisions due to the lower effective O 2 density. Higher v ′ could be more affected since the related emission peaks at higher altitudes. We also studied possible seasonal variations of ∆T non-LTE . However, the limitations of the SABER and X-shooter data sets did not allow us to find convincing trends.
In agreement with Noll et al. (2015), non-LTE effects are critical for absolute mesopause temperature estimates from OH T rot and comparisons of bands with different v ′ . Dynamical studies based on T rot derived from OH bands with high v ′ could also be significantly affected. In this respect, only low v ′ appear to be sufficiently safe.