Formal blind intercomparison of OH measurements: results from the international campaign HOxComp

Hydroxyl radicals (OH) are the major oxidizing species in the troposphere. Because of their central importance, absolute measurements of their concentrations are needed to validate chemical mechanisms of atmospheric models. The extremely low and highly variable concentrations in the troposphere, however, make measurements of OH difficult. Three techniques are currently used worldwide for tropospheric observations of OH after about 30 years of technical developments: Differential Optical Laser Absorption Spectroscopy (DOAS), Laser-Induced Fluorescence Spectroscopy (LIF), and Chemical Ionisation Mass Spectrometry (CIMS). Even though many measurement campaigns with OH data have been published, the question of accuracy and precision is still under discussion. Here, we report results of the first formal, blind intercomparison of these techniques. Six OH instruments (4 LIF, 1 CIMS, 1 DOAS) participated successfully in the ground-based, international HOxComp campaign carried out in Jülich, Germany, in summer 2005. Comparisons were performed for three days in ambient air (3 LIF, 1 CIMS) and for six days in the atmosphere simulation chamber SAPHIR (3 LIF, 1 DOAS). All instruments were found to measure tropospheric OH concentrations with high sensitivity and good time resolution. The pairwise correlations between different data sets were linear and yielded high correlation coefficients (r²=0.75−0.96). Excellent absolute agreement was observed for the instruments at the SAPHIR chamber, yielding slopes between 1.01 and 1.13 in the linear regressions. In ambient air, the slopes deviated from unity by factors of 1.06 to 1.69, which can partly be explained by the stated instrumental accuracies. In addition, sampling inhomogeneities and calibration problems have apparently contributed to the discrepancies. The absolute intercepts of the linear regressions did not exceed 0.6×10⁶ cm⁻³, mostly being insignificant and of minor importance for daytime observations of OH. No relevant interferences with respect to ozone, water vapour, NOx and peroxy radicals could be detected. The HOxComp campaign has demonstrated that OH can be measured reasonably well by current instruments, but also that there is still room for improvement of calibrations.
Correspondence to: H.-P. Dorn (h.p.dorn@fz-juelich.de)


Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
The hydroxyl radical (OH) is the key reactant for the degradation of most compounds emitted from biogenic and anthropogenic sources into the troposphere, e.g. sulfur dioxide, nitrogen dioxide, carbon monoxide, methane, and volatile hydrocarbons (Ehhalt, 1999; Lelieveld et al., 2004). Most of these compounds and their degradation products have an adverse impact on the environment because of their toxicity, global warming potential, or stratospheric ozone depletion capability. OH radicals are primarily produced by photolysis of ozone and the subsequent reaction of the formed excited oxygen atoms, O(¹D), with water vapour:
O₃ + hν → O(¹D) + O₂ (R1)
O(¹D) + H₂O → 2 OH (R2)
Minor sources are the photolysis of nitrous acid (HONO) and hydrogen peroxide, and the ozonolysis of alkenes. The major secondary OH source, i.e. from other radical species, is the reaction of nitric oxide (NO) with hydroperoxy radicals (HO₂). Lifetimes of OH vary between 1 s and 10 ms in clean and polluted environments, respectively, due to the rapid reactions of OH with atmospheric trace gases. Given the high reactivity and correspondingly short lifetime, the tropospheric OH concentration is generally low (sub-ppt) and highly variable. At night, when the photolytic production vanishes, OH concentrations have been observed at levels ranging from as low as a few 10⁴ cm⁻³ in clean marine air (Tanner and Eisele, 1995) up to values around 10⁶ cm⁻³ at a forest site. During the day, when OH generally correlates well with the solar UV flux, concentrations can reach maximum values of 10⁷ cm⁻³ (0.4 ppt) at noon in clean and polluted environments (e.g. Eisele et al., 1996; Martinez et al., 2008).
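The equivalence between 10⁷ cm⁻³ and roughly 0.4 ppt quoted above can be checked with the ideal gas law. The following is a minimal sketch only; the temperature (298.15 K) and pressure (1013.25 hPa) are assumptions chosen for illustration and are not stated in the text, and the function names are hypothetical.

```python
# Convert an OH number density (molecules per cm^3) into a mixing ratio in ppt.
# Assumed conditions (not from the paper): 298.15 K and 1013.25 hPa.
BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def air_number_density_cm3(temperature_k=298.15, pressure_pa=101325.0):
    """Ideal-gas number density of air in molecules per cm^3."""
    per_m3 = pressure_pa / (BOLTZMANN_J_PER_K * temperature_k)
    return per_m3 * 1e-6  # 1 m^3 = 1e6 cm^3


def number_density_to_ppt(conc_cm3, temperature_k=298.15, pressure_pa=101325.0):
    """Mixing ratio in parts per trillion (1e-12) for a given number density."""
    return conc_cm3 / air_number_density_cm3(temperature_k, pressure_pa) * 1e12


noon_oh_ppt = number_density_to_ppt(1e7)
print(f"{noon_oh_ppt:.2f} ppt")  # close to the 0.4 ppt quoted in the text
```

Under these assumed conditions the air number density is about 2.5×10¹⁹ cm⁻³, so the result depends only weakly on the exact temperature and pressure chosen.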
Since the 1970s OH radicals have been recognised as the major oxidant in the atmosphere, converting more than 90% of the volatile organic matter (Levy, 1974). Since then many attempts have been made to measure OH concentrations in the troposphere by various techniques (see review by Heard and Pilling, 2003). Tropospheric OH was first detected by Perner et al. (1976) in Jülich using Differential Optical Absorption Spectroscopy. DOAS based OH instruments were also developed in Frankfurt (Armerding et al., 1994) and Boulder (Mount et al., 1997). However, currently only one instrument is being operated, by the Jülich group, in field and chamber campaigns (Brauers et al., 2001; Schlosser et al., 2007). The most widely applied OH measurement technique is Laser-Induced Fluorescence (LIF) combined with a gas expansion, also known as Fluorescence Assay with Gas Expansion (FAGE) (e.g. Hard et al., 1984; Stevens et al., 1994; Holland et al., 1995; Creasey et al., 1997; Kanaya et al., 2001; Dusanter et al., 2008; Martinez et al., 2008). LIF instruments directly measure OH with high sensitivity and can be built compact for mobile operation. Chemical-Ionisation Mass-Spectrometry (CIMS) is an indirect OH measurement technique with very high sensitivity and good mobility for ground and aircraft field campaigns, comparable to LIF instruments (Eisele and Tanner, 1991; Berresheim et al., 2000). Long-term monitoring of OH concentrations has only been demonstrated using CIMS (Rohrer and Berresheim, 2006). All three techniques (DOAS, LIF, CIMS) involve elaborate, expensive, custom-made experimental setups with vacuum pumps, laser systems, and/or mass spectrometers. Therefore, fewer than ten groups worldwide measure atmospheric OH using these techniques. Other techniques, e.g. the salicylic acid scavenger method (Salmon et al., 2004) or the radiocarbon tracer method (Campbell et al., 1986), do not reach the quality standards of accuracy, sensitivity and time resolution provided by LIF, CIMS, and DOAS.
Atmospheric OH radicals have been elusive and hard to measure (Brune, 1992), because:
- low OH concentrations require extremely sensitive detection techniques, which are not readily available,
- OH reacts efficiently at wall surfaces, requiring precautions to avoid instrumental OH loss,
- most other atmospheric species are much more abundant, raising the potential for interferences in OH detection,
- stable calibration mixtures for OH do not exist; therefore, calibration requires a technical OH source which produces accurately known amounts of OH radicals.
Initial attempts to measure atmospheric OH by DOAS were successful, but required very long absorption path lengths (10 km) and long integration times of about 1 h (Perner et al., 1987; Platt et al., 1988). Attempts in the 1970s and 1980s to measure atmospheric OH by LIF and the radiocarbon tracer method failed as a result of insufficient detection sensitivity, poor technical performance or interference problems. This was demonstrated in an OH intercomparison of two LIF instruments and one radiocarbon technique during the CITE 1 mission 1983/84 (Beck et al., 1987) and a corresponding NASA funded expert workshop on HxOy measurements (Crosley and Hoell, 1986). The self-generation of OH by laser photolysis of ozone (reactions R1 and R2) turned out to be a major obstacle that hindered reliable OH measurement by LIF methods for many years (see Smith and Crosley, 1990, and references therein). In the beginning of the 1990s, major progress was achieved in terms of detection sensitivities, development of calibration sources and suppression of interferences, providing the basis for fast, sensitive OH measurements by DOAS, LIF, radiocarbon tracer and the newly developed CIMS technique (Crosley, 1994). In the following years, given the experimental effort, only five intercomparisons of atmospheric OH measurements were reported:
- A ground based OH photochemistry experiment (TOHPE) took place at Fritz Peak, Colorado, in 1991 and 1993. Four OH measurement instruments were deployed, but a meaningful intercomparison could only be done for two of them. The NOAA long path DOAS instrument (20.6 km path length using a retro-reflector) and the Georgia Tech CIMS instrument probed different parts of the atmosphere, but provided data with good correlation (r²=0.62) in 1993 (Mount et al., 1997). A linear fit to data (N=140) selected for clear days and low NOx revealed a slope (OH-CIMS/DOAS) of 0.82±0.06 (with correction: 0.95±0.07) and an insignificant intercept.
- During a campaign at a clean-air site near Pullman, in eastern Washington State, USA, in 1992, a LIF instrument of Portland State University (PSU) and a ¹⁴CO radiocarbon instrument operated by Washington State University (WSU) were compared (Campbell et al., 1995). The OH concentrations were near the limit of detection and the LIF instrument required an integration time between 30 min and 60 min per measurement. The correlation coefficient for the two data sets was high (r²=0.74), but the slope of the regression was 3.9±1.0, indicating calibration problems.
- The Jülich DOAS (38.5 m between multi-path mirrors, 1.85 km total path length) and LIF instruments, both operated by the Jülich group, were compared during the field campaign POPCORN in rural Germany in 1994 (Hofzumahaus et al., 1998). Excluding a possibly contaminated wind sector, the instruments agreed well with r²=0.80 (N=137). The linear regression yielded a slope of 1.09±0.04 and an insignificant intercept.
- Two aircraft based campaigns were used to compare OH measurements of the NCAR CIMS instrument aboard the P-3B aircraft and those of the Penn State LIF instrument aboard the NASA DC-8. During PEM Tropics B in 1999, the ratio of the average OH measured by LIF/CIMS increased from 0.8 near the surface to 1.6 at 8 km altitude (Eisele et al., 2001). The TRACE-P campaign in 2001 involved three 0.5 h to 1.5 h comparison periods when the planes flew within 1 km distance. The correlation yielded an r²=0.88 and an approximate slope (CIMS/LIF) of 1.58 with a negligible intercept (Eisele et al., 2003). The OH data of the Penn State LIF were later revised because an error was found in the calibration of the primary standard, a photomultiplier tube (PMT) used to measure the photon flux; the revised values are a factor of 1.64 higher (Ren et al., 2008). A slope of 0.96 is found, i.e. the two instruments agree, if the slope reported earlier is divided by this factor.
- The Jülich DOAS (20 m between multi-path mirrors, 2.24 km total path length) and LIF instruments were again compared by the Jülich group in their atmosphere simulation chamber SAPHIR. The correlation was excellent (r²=0.93) based on 400 data points. A marginal intercept and a slope of 0.99±0.13 were found.
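The revised TRACE-P slope mentioned in the list above follows from simple arithmetic: because the revised LIF values are a factor of 1.64 higher, the CIMS/LIF slope scales down by the same factor. A minimal sketch, with hypothetical variable names:

```python
# Revised CIMS/LIF regression slope for TRACE-P after the LIF recalibration.
reported_slope_cims_over_lif = 1.58  # Eisele et al. (2003), as quoted in the text
lif_revision_factor = 1.64           # Ren et al. (2008), as quoted in the text

# Raising every LIF value by a constant factor divides the CIMS/LIF slope by it.
revised_slope = reported_slope_cims_over_lif / lif_revision_factor
print(f"revised CIMS/LIF slope: {revised_slope:.2f}")  # prints 0.96
```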
In this study, we present the first formal, blind intercomparison of OH measurements, conducted as part of the European-funded ACCENT program (Atmospheric Composition Change: the European Network of Excellence). All groups worldwide operating OH instruments were invited to participate. The groups from Germany (Deutscher Wetterdienst, Max-Planck Institut Mainz, Forschungszentrum Jülich), UK (University of Leeds) and Japan (Frontier Research Center for Global Change) took part in the corresponding campaign HOxComp (HOx intercomparison) with seven different instruments (5 LIF instruments, 1 CIMS, and 1 DOAS), each using their own calibration scheme. Due to an unfortunate laser system failure, the instrument of the UK group (Creasey et al., 1997) did not produce any measurements. This paper therefore deals with the results of the remaining four groups (see Table 1). The campaign was performed as a two stage experiment with three days of measurements in ambient air and six days of measurements in the atmosphere simulation chamber SAPHIR on the campus of the Forschungszentrum in Jülich. The goal was the quality assurance of instruments used for detection of atmospheric OH (this work) and HO₂, addressing the following questions:
- are current instruments (DOAS, LIF, CIMS) capable of measuring atmospheric OH and HO₂ unambiguously?
- are the measurements free of interferences?
- are the measurements correct and do they agree within the stated accuracies of their calibrations?
The whole process of formal blind intercomparison, the measurements and their evaluation, was independently refereed by Ulrich Schurath from Forschungszentrum Karlsruhe, Germany.

The OH instruments
An overview and specifications of the six instruments that provided OH measurement data are given in Table 1. Detailed descriptions are quoted in the last column, while summaries of the OH instruments are given in the following:

DOAS (FZJ), Forschungszentrum Jülich, Germany
The FZJ-DOAS, which has been deployed previously in field and chamber campaigns (Brauers et al., 2001; Schlosser et al., 2007), was used only for measurements in the SAPHIR chamber. It uses a ps-pulsed mode-locked UV laser as light source in combination with a multiple-reflection cell (White system, base length 20 m, light path length 2240 m). The Multi-Channel Scanning Technique (MCST) is used to reduce the noise of the photo diode array (PDA, 1024 channels, cooled to 238 K), which enables the instrument to detect narrow-band absorbances of the order of 10⁻⁵ at high spectral resolution (Δλ=2.7 pm). The measurement time interval is 135 s and one measurement cycle takes 3 min. The spectrum is de-convoluted by fitting a trigonometric background and three to five reference spectra (OH, HCHO, a so far unidentified absorber X, and additionally SO₂ and naphthalene in case of ambient air measurements). The precision is calculated for each measurement from the bootstrap error estimate and residual inspection by cyclic displacement (Hausmann et al., 1999). For this instrument the precision was determined to be 1.2×10⁶ cm⁻³ for 135 s time intervals. Additional OH radicals may be formed by photolysis of O₃ within the probe volume of the UV laser beam. The amount of this artificially produced OH depends, e.g., on the O₃ and H₂O concentrations, the OH lifetime, the UV laser power, and the dwell time of the air within the volume probed by the laser beam. Under adverse conditions that promote artificial OH generation (high O₃ concentration (143 ppb), no air movement in the dark chamber, long OH lifetime) an offset of (2.9±0.1)×10⁶ cm⁻³ per 1 mW of UV laser power was detected. This effect was reduced by convection when the chamber was exposed to sunlight and when a fan was operated, e.g. during an experiment with high O₃ concentration which took place in the dark chamber on 22 July 2005. Additionally, the UV laser power was limited to a maximum of 1 mW and monitored to keep this interference well below 0.2×10⁶ cm⁻³.
[Table 1 references: 1: Berresheim et al. (2000); 2: Kanaya et al. (2001), Kanaya and Akimoto (2006); 3: Martinez et al. (2008); 4: Holland et al. (1995); 5: Dorn et al. (1995), Hausmann et al. (1997), Schlosser et al. (2007)]
No field calibration is needed for the FZJ-DOAS because OH concentrations are directly derived from the measured optical densities. Therefore, the accuracy of the DOAS instrument is mainly limited by the uncertainty of the effective rovibronic absorption cross sections in the probed wavelength range (308.00 nm to 308.18 nm) of the OH A²Σ⁺(υ′=0)←X²Π(υ″=0) transition, which is approximately 3%. A maximum uncertainty of 6.5% was stated by Hausmann et al. (1997) and is supported by chamber experiments (Poppe et al., 2007).
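The calibration-free character of DOAS can be illustrated with the Lambert-Beer relation, in which the optical density equals the product of cross section, path length, and concentration. The sketch below is illustrative only: the effective cross section value is a hypothetical placeholder (the text gives no number), the function name is an assumption, and the optical density is taken as the natural logarithm of the intensity ratio. Only the 2240 m path length and the absorbance scale of 10⁻⁵ come from the text.

```python
# Lambert-Beer sketch: optical density = sigma * path_length * concentration.
PATH_LENGTH_CM = 2240.0 * 100.0       # 2240 m light path from the text, in cm
HYPOTHETICAL_SIGMA_CM2 = 1.0e-16      # placeholder effective cross section, NOT from the paper


def oh_concentration_cm3(optical_density, cross_section_cm2, path_length_cm):
    """OH number density (cm^-3) from a measured optical density ln(I0/I)."""
    return optical_density / (cross_section_cm2 * path_length_cm)


# An absorbance of the order of 1e-5, the detection scale named in the text.
oh = oh_concentration_cm3(1.0e-5, HYPOTHETICAL_SIGMA_CM2, PATH_LENGTH_CM)
print(f"[OH] = {oh:.2e} cm^-3")

# The result scales inversely with the cross section, so a 3% cross-section
# uncertainty maps onto roughly a 3% uncertainty in the derived concentration.
oh_low_sigma = oh_concentration_cm3(1.0e-5, 0.97 * HYPOTHETICAL_SIGMA_CM2, PATH_LENGTH_CM)
relative_change = oh_low_sigma / oh - 1.0
```

Because no instrument-specific sensitivity enters this relation, the accuracy is governed by how well the cross section and the path length are known.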

LIF (FRCGC, FZJ, MPI)
Four LIF instruments contributed measurement data during this campaign (see Table 1). LIF can be used for the sensitive and fast direct detection of OH and the indirect detection of HO₂ and RO₂ after chemical conversion to OH. Current techniques probe the OH radicals after expansion of ambient air through an inlet nozzle into a detection chamber at a pressure of a few hPa. Single rovibronic lines of the OH A²Σ⁺(υ′=0)←X²Π(υ″=0) transition are excited by pulsed UV laser light near 308 nm, and resonance fluorescence in the 307-311 nm range is detected by gated photon counting perpendicular to the gas beam and the laser beam. The background signal, resulting from scattered laser radiation and solar stray light, is determined and subtracted for each OH measurement using an on- and off-resonance tuning cycle. Raw data is normalised using the measured laser power and corrected for fluorescence quenching by water vapour. Calibration is performed with known concentrations of OH radicals, which are generated by photolysis of water vapour at 184.9 nm. The instruments vary in their technical details such as the nozzle and low pressure chamber geometry and volume, laser models and light guidance, detection volume geometry and detector types, architecture and custom made calibration units (see below, Sect. A, and Table 1).
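The generic data reduction steps described above (background subtraction, laser power normalisation, quenching correction, and conversion with a calibrated sensitivity) can be sketched as follows. This is an illustration under stated assumptions, not the processing chain of any participating instrument: the function name, the parameter names, the units of the sensitivity, and all numerical values are hypothetical.

```python
# Generic LIF signal reduction sketch (all names and numbers are hypothetical).


def lif_oh_concentration(on_counts_per_s, off_counts_per_s, laser_power_mw,
                         sensitivity, quench_factor=1.0):
    """Convert on/off-resonance count rates into an OH number density (cm^-3).

    sensitivity   : counts s^-1 mW^-1 per (molecule cm^-3), from a calibration
                    performed in dry air (assumed convention)
    quench_factor : relative sensitivity at the ambient water vapour level
                    (1.0 means no additional quenching; assumed convention)
    """
    net_signal = on_counts_per_s - off_counts_per_s   # subtract background
    normalised = net_signal / laser_power_mw          # normalise to laser power
    return normalised / (sensitivity * quench_factor) # apply calibration


oh_cm3 = lif_oh_concentration(
    on_counts_per_s=150.0,
    off_counts_per_s=100.0,
    laser_power_mw=10.0,
    sensitivity=1.0e-6,
    quench_factor=0.9,
)
print(f"[OH] = {oh_cm3:.2e} cm^-3")
```

Keeping the quenching correction as a separate factor makes it explicit that the calibration and the ambient measurement may be performed at different humidities.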
All LIF instruments additionally measured HO₂. The measurement involves chemical conversion of HO₂ to OH by addition of NO in the gas expansion, followed by LIF detection of the additionally formed OH. The OH and HO₂ measurements can be performed in a single chamber in an alternating mode or in two detection chambers, which are coupled or completely separated.
The FRCGC-LIF instrument of the Frontier Research Center for Global Change, Yokohama, Japan, has been deployed in several field campaigns in Japan (Kanaya et al., 2007a, b). The setup includes a single detection cell, in which OH and HOx are measured alternately (Kanaya et al., 2001; Kanaya and Akimoto, 2006). Short periods between these measurements are used to measure the background and to scan and lock the laser wavelength. Concentrations of HO₂ are calculated from the difference of the measured HOx and the 10 min averaged OH levels. A black aluminum disk (halocarbon wax coated) was used as sun shade for ambient measurements in order to reduce the solar background in the measurement signals. In previous experiments with a different laser system only a small power dependent correction for OH from the laser photolysis of ambient ozone has been established (Kanaya et al., 2007a). With the 10 kHz laser system used in the present work (average laser power: (5-9) mW at 308 nm) the correction is considered to be negligible.
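The difference method for HO₂ described above can be sketched in a few lines. The function name and the example values are hypothetical, and the sketch assumes that the OH samples passed in already cover the 10 min averaging window.

```python
from statistics import fmean


def ho2_from_hox(hox_cm3, oh_samples_cm3):
    """HO2 as the measured HOx minus the averaged OH level (both in cm^-3).

    oh_samples_cm3 is assumed to hold the OH values of the 10 min window
    around the HOx measurement.
    """
    return hox_cm3 - fmean(oh_samples_cm3)


# Hypothetical example values, for illustration only.
ho2_cm3 = ho2_from_hox(2.05e8, [4.0e6, 5.0e6, 6.0e6])
print(f"[HO2] = {ho2_cm3:.2e} cm^-3")
```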
Two LIF systems of the Forschungszentrum Jülich, Germany, were operated during the campaign. The FZJ-LIF-SAPHIR was used for measurements at the chamber only while the FZJ-LIF-ambient was used at the field site. Both instruments differ in their setup, but are based on the same concept (Holland et al., 1995).
The FZJ-LIF-SAPHIR instrument was previously used for field campaigns (e.g. Holland et al., 2003) and is now permanently installed at the SAPHIR chamber. FZJ-LIF-SAPHIR compared very well, within 10%, to FZJ-DOAS in previous tests (Hofzumahaus et al., 1998; Schlosser et al., 2007). For the present measurements, FZJ-LIF-SAPHIR was modified by replacing the formerly used copper-vapour laser pumped dye laser system by a frequency doubled Nd-YAG (DPSS Spectra Physics Navigator I) pumped tuneable, frequency doubled dye laser (NLG Tintura) with a total laser power of (35-40) mW at 308 nm. The FZJ-LIF-SAPHIR instrument uses two detection chambers for separate detection of OH and HO₂, each equipped with its own inlet nozzle. The separation of the two detection cells avoids potential contamination of the OH cell by NO, which is used for HO₂ conversion in the other chamber. In the current setup, ozone-related interference signals were not noted within the limit of OH detection for ozone concentrations up to 260 ppbv at 1.4% of water vapour. Therefore, no ozone-related correction was performed in this work. The OH calibrations were reproducible from day to day within 5%, except for an unusually low OH detection sensitivity noted in the calibration of 22 July 2005. The measurements of this day were marked "not valid" to indicate a potential calibration problem. A large laser-power dependent background signal led to a considerably higher OH detection limit of 25×10⁵ cm⁻³ (S/N=2, Δt=30 s) compared to earlier campaigns. The accuracy of the calibration is estimated to be 20% (2σ).
The FZJ-LIF-ambient instrument was first operated during the ECHO campaign using the same concept as FZJ-LIF-SAPHIR. The construction and operating conditions of the OH and HO₂ detection chambers are the same, but the electronics, gas handling system and vacuum pump are designed to be smaller and lightweight, making the instrument suitable for mobile applications. The compact laser system (DPSS Photonics DS20-532; dye laser LAS Intradye) had a total UV power of 25 mW, which was directed sequentially through the OH and HO₂ chambers and a reference cell for controlling the OH wavelength. An ozone interference of 0.7×10⁴ cm⁻³ per ppb O₃ was determined and taken into account during evaluation. No power dependence was needed for the parametrisation, because the monitored laser power was virtually constant. The OH detection limit was 5.3×10⁵ cm⁻³ (S/N=2, Δt=137 s) and the reproducibility was 13%. The calibration and the accuracy are the same as for FZJ-LIF-SAPHIR.
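The ozone interference quoted above for FZJ-LIF-ambient is a linear parametrisation in the ozone mixing ratio. A minimal sketch of how such a correction might be applied is given below; the function name and the example values are hypothetical, and the actual evaluation procedure may differ.

```python
# Linear ozone interference correction (coefficient from the text).
O3_INTERFERENCE_CM3_PER_PPB = 0.7e4  # apparent OH (cm^-3) per ppb of O3


def correct_for_ozone(oh_raw_cm3, o3_ppb):
    """Subtract the ozone-related apparent OH signal from a raw OH value."""
    return oh_raw_cm3 - O3_INTERFERENCE_CM3_PER_PPB * o3_ppb


# Hypothetical example: raw OH of 5.0e6 cm^-3 at 60 ppb ozone.
interference_cm3 = O3_INTERFERENCE_CM3_PER_PPB * 60.0
oh_corrected_cm3 = correct_for_ozone(5.0e6, 60.0)
print(f"interference: {interference_cm3:.1e} cm^-3, corrected OH: {oh_corrected_cm3:.2e} cm^-3")
```

At 60 ppb ozone the correction amounts to 4.2×10⁵ cm⁻³, which is of the same order as the stated detection limit of 5.3×10⁵ cm⁻³.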
The MPI-LIF instrument of the Max-Planck-Institut, Mainz, Germany, was developed mainly for mobile platforms as a highly time resolved field instrument for OH and HO₂ measurements. The LIF instrument is based on concepts developed by W. H. Brune and coworkers (Faloona et al., 2004). The further development of the MPI-LIF design incorporates a Nd-YAG system as pump laser (2nd harmonic 532 nm, 2.6 W at 3 kHz) for an intracavity frequency doubled tunable dye laser. The wavelength is line-locked on the Q₁(2) line signal from a reference cell in which OH radicals are produced by H₂O thermolysis using a hot filament. Light is guided by UV fibres to the detection cells (average laser power coupled into the OH channel: (2-20) mW at 308 nm) and fluorescence is detected by multichannel plates. Unlike the other LIF instruments, it uses a multi-reflection cell (White system) to enhance the number of fluorescence photon counts and thus the sensitivity, and it features a tandem detection cell setup. Ambient air is expanded through a nozzle into a low-pressure fluorescence cell where OH radicals are first detected by LIF. NO is then added to the gas beam that leaves the OH detection cell to convert HO₂ to OH for the (indirect) HO₂ detection within the second detection cell. The cell geometries are designed to prevent pollution of the OH detection cell with NO, which is injected between the OH and HO₂ stages, thereby preventing interference of HO₂ with the detection of OH. In contrast to OH measurements in daylight, a significant and variable OH background signal was often observed during periods without daylight in experiments previous to HOxComp. Therefore all OH measurements by the MPI-LIF at times without daylight were submitted to the referee as not valid. The reason for this effect is not yet understood. Studies have verified, though, that the interference is not due to laser-induced OH generation.

CIMS (DWD), Deutscher Wetterdienst, Hohenpeissenberg, Germany
The DWD-CIMS instrument is usually installed at the Meteorological Observatory Hohenpeissenberg, where it has been in operation almost continuously since 1998 (see Rohrer and Berresheim, 2006). Its operation at the HOxComp field site was identical to the routine operation. The OH detection by CIMS is based on the work of Eisele and Tanner (1991) and has been described by Berresheim et al. (2000), with some modifications as outlined below. The measurement principle includes continuous sampling of ambient air, followed by chemical titration, ion reaction, cluster dissociation, and mass selective detection. Ambient air (2400 slm) is pumped through a 100 mm wide tube with a smooth, ring shaped inlet. The central part of the flow is sampled 120 mm below the intake at a flow rate of 16 slm through a conical nozzle (10 mm diameter). ³⁴SO₂ is injected at the front edge of the nozzle and forms H₂SO₄ from OH. Propane is added downstream to scavenge 98% of the recycled OH. The remaining 2% of recycled OH and ambient H₂SO₄ are determined by background measurements using propane instead of SO₂ as OH scavenger. The processed sample gas is transferred through a 900 mm long tube to the ion reaction zone, where NO₃⁻ ions are added to the sample from a sheath gas flow. H₂SO₄ is deprotonated by NO₃⁻ ions and then the ions are selectively transferred into a vacuum system. Ion clusters are decomposed in a collision-dissociation unit and are refocused by electrical lenses to the quadrupole mass spectrometer (Extrel Inc.). An OH measurement cycle lasts 30 s, of which 8 s are used to obtain the ambient OH signal and another 8 s for the background signal.
The following modifications have been applied to the instrument setup since its description by Berresheim et al. (2000):
(1) A nozzle at the head of the sampling tube reduces disturbance in the titration zone due to cross-wind.
(2) The sample inlet (nozzle) has been moved to 120 mm below the air inlet (from 300 mm) to minimise OH losses and chemical interferences in the inlet region.
(3) The sample flow rate was increased from 10 slm to 16 slm, yielding better signal-to-noise ratios and reduced chemical interferences in the titration zone.
(4) The length of the sample tube transferring H₂SO₄ to the ion reaction zone was changed from 300 mm to 900 mm (to transfer the gas through the ceiling of the laboratory at Hohenpeissenberg).
Losses in the nozzle and sample tube are routinely accounted for in calibration measurements.
The OH concentration is obtained after correction for background H₂SO₄ and inlet chemistry, i.e. recycling of OH from NO+HO₂ and OH losses by CO, NMHC, and NO₂. The DWD-CIMS is designed for fairly clean atmospheric conditions, while in Jülich mostly polluted conditions were encountered. Therefore, the correction factors to compensate for chemically induced changes of OH in the intake were higher than at Hohenpeissenberg; typically, corrections of 30% with an uncertainty of ±11% were applied. An accuracy of 38% (2σ) results from the uncertainties of the CIMS calibration and the chemical correction factors during HOxComp. The precision of the OH measurements is 0.22×[OH]+0.19×10⁶ cm⁻³ (2σ, signal integration time: 8 s).
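The stated CIMS precision is a linear function of the OH concentration, with a proportional term and a constant term. A minimal sketch evaluating it (the function name and the example concentration are hypothetical):

```python
def cims_precision_2sigma(oh_cm3):
    """2-sigma precision (cm^-3) of the DWD-CIMS for 8 s signal integration,
    using the expression quoted in the text: 0.22*[OH] + 0.19e6 cm^-3."""
    return 0.22 * oh_cm3 + 0.19e6


# Hypothetical daytime value of 5e6 cm^-3.
precision_cm3 = cims_precision_2sigma(5.0e6)
print(f"precision (2 sigma): {precision_cm3:.2e} cm^-3")
```

For this example the precision is 1.29×10⁶ cm⁻³, i.e. about 26% of the measured value; at low concentrations the constant term dominates.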
The CIMS instrument was operated during the ambient and chamber parts of the campaign. However, only the sample tube with the nozzle was installed at SAPHIR, since an intake flow of 2400 slm was not possible in chamber experiments. Thus, the air intake system was substantially changed from routine operation and the use of the DWD calibration unit (Sects. 2.1.4 and A) was not possible. Although good and consistent results relative to the other OH measurement systems were achieved most of the time, it was decided to flag the results from the chamber, because the substantially modified intake system had not been characterised and the sensitivity of the system could not be quantified adequately.

Calibration
DOAS is based on the Lambert-Beer law and needs only the light path length and the effective absorption cross section of OH. All LIF instruments and CIMS require a calibration in order to convert the measured signals into OH concentrations. Calibration is achieved by providing a well-known OH concentration. The common technique for accurately quantified OH production is the photolysis of water vapour at 184.9 nm (mercury lamp) in a flow of (synthetic) air at ambient pressure, from which calibration gas is sampled (Aschmutat et al., 1994; Schultz et al., 1995; Kanaya et al., 2001; Bloss et al., 2004; Faloona et al., 2004; Martinez et al., 2008; Dusanter et al., 2008). The photolysis yields equal concentrations of OH and HO₂, which can be calculated from a few experimental parameters:
[OH] = [HO₂] = [H₂O] × σ_H₂O × F × t
Here, σ_H₂O is the well known absorption cross section of water vapour at 184.9 nm. Its value of 7.14×10⁻²⁰ cm² (at 25 °C) measured by Cantrell et al. (1997) was confirmed within 2% by Hofzumahaus et al. (1997) and Creasey et al. (2000). The water vapour concentration [H₂O] can be measured accurately, e.g. by a dew point hygrometer. The other parameters are the actinic flux F of the 184.9 nm radiation and the exposure time t of the calibration gas. Each experimental group (LIF, CIMS) had its own calibration device and its own method to determine these parameters, as explained in Appendix A. The resulting accuracies of the calibrations are listed in Table 1.
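The water photolysis calibration described above can be sketched numerically as the product of water vapour concentration, absorption cross section, actinic flux, and exposure time, assuming a photolysis quantum yield of unity. Only the cross section is taken from the text; the function name and the example values for water concentration, flux, and exposure time are hypothetical and chosen purely for illustration.

```python
# OH produced by 184.9 nm photolysis of water vapour (unit quantum yield assumed).
SIGMA_H2O_CM2 = 7.14e-20  # absorption cross section at 184.9 nm, from the text


def calibration_oh_cm3(h2o_cm3, flux_photons_cm2_s, exposure_s):
    """OH (= HO2) number density in cm^-3 generated in the calibration gas."""
    return h2o_cm3 * SIGMA_H2O_CM2 * flux_photons_cm2_s * exposure_s


# Hypothetical inputs: about 1% water vapour at ambient pressure,
# a lamp flux of 1e12 photons cm^-2 s^-1 and 10 ms exposure.
oh_cal_cm3 = calibration_oh_cm3(h2o_cm3=2.5e17,
                                flux_photons_cm2_s=1.0e12,
                                exposure_s=0.01)
print(f"calibration [OH] = {oh_cal_cm3:.3e} cm^-3")
```

Since the cross section and the water concentration are well known, the uncertainty of such a calibration is mostly determined by how accurately the flux and the exposure time can be established.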
The DWD-CIMS instrument has a built-in calibration unit within the instrument's main air inlet tube. OH radicals are produced during ambient air sampling from photolysis of atmospheric water vapour by switching on a mercury lamp every 20 min for 5 min. In contrast, all LIF instruments have external radical sources, each of which consists of a flow tube, an illumination unit, and a supply of synthetic air. Calibration measurements were usually performed once a day during the campaign. Measured OH calibration factors, which showed no significant variability or trend, were averaged for the following time periods

Measurement sites
The HOxComp campaign took place on the campus of the Forschungszentrum in Jülich (50°54′33″ N, 06°24′44″ E). The instruments were set up within and partly on top of several containers. The formal part of the campaign included three days of ambient measurements from 9-11 July 2005 and six days of chamber experiments with the SAPHIR chamber from 17-23 July 2005. The weekend days 9-10 July 2005 had essentially no traffic on the campus of the Forschungszentrum.
[Fig. 1 caption, beginning lost in extraction: ... (2) temperature, relative humidity, HONO; (3) ultrasonic anemometer; (4) filter-radiometer; (5) O₃. A road (closed for traffic) is located southeast and the site is bordered in the north and west by bushes and trees (marked by a green line). Liquid nitrogen and oxygen are stored in two tanks northeast of the chamber.]

Field site
Ambient air measurements took place on the paved area between the institute building and the SAPHIR chamber (Fig. 1). The site is bordered by bushes, trees and a small road. The area is characterised by buildings, small roads, grassland and trees. The Forschungszentrum is surrounded by deciduous forest, agricultural areas, and main roads. The OH instruments were placed approximately 13 m east of the chamber, side-by-side from north to south in the following order: DWD-CIMS, Leeds-LIF, MPI-LIF, FRCGC-LIF, FZJ-LIF-ambient, with spacings between the instruments' sampling inlets of approximately 2.9 m, 2.7 m, 3.2 m, and 4.5 m, respectively. All OH instruments sampled ambient air at about equal height (3.5 m) above ground. Standard instruments recorded humidity, NOx, O₃, and meteorological data. Additional measurements of HONO, hydrocarbons, and photolysis frequencies were also conducted.

SAPHIR chamber
After the intercomparison measurements in ambient air, the containers housing the OH instruments were moved to the SAPHIR chamber and the OH instruments were installed. FZJ-DOAS probed along the axis of the chamber approximately 1.7 m above the inlets of DWD-CIMS, (Leeds-LIF), FRCGC-LIF, MPI-LIF, and FZJ-LIF-SAPHIR, which sampled air 2 cm to 13 cm from the chamber floor.
The atmosphere simulation chamber SAPHIR (Simulation of Atmospheric PHotochemistry In a Large Reaction Chamber) is designed to investigate tropospheric photochemistry under controlled chemical composition comparable to ambient air at ambient temperature, pressure and natural irradiation (e.g. Rohrer et al., 2005; Bohn and Zilken, 2005; Wegener et al., 2007; Poppe et al., 2007; Apel et al., 2008). It is constructed of a double-walled FEP cylinder (125 µm and 250 µm thickness; diameter 5 m, length 18 m, volume 270 m³), held by a steel frame and stabilised to 50 Pa above ambient pressure. In addition to the slight overpressure, the volume between the inner and outer FEP film is flushed with clean N₂ to exclude contamination of the chamber by ambient air. The FEP foil has a transmission of 85% for visible light, UV-A, and UV-B. A louvre system allows fast shadowing of the chamber.
One chamber experiment was performed per day (17-19, 21-23 July 2005). Each experiment started with overnight flushing of the dark chamber with ultra-pure synthetic air to reach low trace gas mixing ratios (NO x <10 ppt, CO<1 ppb, CH 4 <15 ppb, HCHO<50 ppt, hydrocarbons<10 ppt, O 3 <1 ppb, and H 2 O<0.05 mbar). In a second step Milli-Q water (Millipore) was evaporated and added to the purge flow to adjust the humidity. Trace compounds (e.g. O 3 , NO x , VOC, and/or CO) were then added while mixing was assured by operation of a fan for 30 min. After complete gas mixing, intercomparison measurements were started. During the experiments, photochemistry was controlled by the louvre system which allowed the chamber to be exposed to or shielded from solar radiation. Periods of 1 h were scheduled for the addition of trace gases during the experiments. The louvre system was closed, followed by 30 min in the dark with no other changes. Then, the chemical composition was changed in the dark chamber with the fan turned on. Photochemistry was resumed by opening the louvre system.
A replenishment flow of (5-10) m³ h⁻¹ of clean, dry synthetic air was used to compensate for sampling by extractive measurements (4.5 m³ h⁻¹) and for leakage, which caused a dilution of (2-3)% h⁻¹. Instruments were calibrated once a day subsequent to the experiments.
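The quoted dilution rate can be checked roughly against the replenishment flow and the chamber volume. The sketch below assumes a perfectly mixed chamber with the full 270 m³ volume participating and a first-order dilution rate equal to flow divided by volume; the function name and this simplification are illustrative assumptions, not the procedure used in the campaign.

```python
CHAMBER_VOLUME_M3 = 270.0  # chamber volume quoted in the text


def dilution_percent_per_hour(flow_m3_per_h, volume_m3=CHAMBER_VOLUME_M3):
    """First-order dilution rate of a well-mixed volume, in percent per hour.

    Assumption: the replenishment flow replaces chamber air uniformly.
    """
    return 100.0 * flow_m3_per_h / volume_m3


if __name__ == "__main__":
    # Lower and upper end of the replenishment flow range given in the text.
    for flow in (5.0, 10.0):
        rate = dilution_percent_per_hour(flow)
        print(f"{flow:4.1f} m3/h -> {rate:.1f} % per hour")
```

Under these assumptions the flow range brackets the stated dilution of a few percent per hour.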

Data measurement protocol
The referee supervised all measurements and was the only person aware of all experimental details and authorised to change the experimental conditions. All groups synchronised their clocks (UTC, accuracy of time setting ±1 s). During the formal blind intercomparison measurements no communication of data or results was allowed between groups. Daily preliminary measurement data of each instrument was sent to the referee within 12 h after the end of an experiment. After the campaign, the groups prepared final data sets and questionable data was identified as part of the usual data analysis by each group, but not removed. Instead it was marked "not valid" with a quality flag indicating the reason. In some cases data of whole days was marked "not valid" for individual instruments because uncertainties in the calibration were noted (e.g. 22 July 2005 of the FZJ-LIF-ambient and all 6 days of chamber measurements of the DWD-CIMS). The MPI-LIF marked all measurements in the dark "not valid" because of measurement artefacts. Final data was submitted to the referee eight weeks after the campaign. OH data was then disclosed and discussed among the HOxComp participants during a workshop in Jülich four months after the campaign.
The group operating the two FZJ-LIF instruments became aware of a systematic error in their calibration after the submission of their data to the referee. The cause was technically simple, but the error was not obvious and had a significant effect on the calibration of the instruments. A mass flow controller which supplied synthetic air to the OH calibration unit had been incorrectly calibrated. This was discovered in 2006 during re-evaluation of a set of laboratory experiments that were performed before and after the HOxComp campaign in order to characterise the calibration unit. The revision entailed increases of the initially submitted OH concentrations by factors of 1.26 and 1.28 for the FZJ-LIF-ambient and the FZJ-LIF-SAPHIR, respectively. A corresponding revision of their submitted data was authorised by the referee after discussing the planned change with the other groups.

Hydroxyl radical measurements
All valid OH concentrations for nine days of the formal blind intercomparison are shown in Fig. 2 using the original time resolution of each instrument. Six instruments were successfully deployed for OH measurements, but only four instruments each recorded valid data concurrently at the field site and at the SAPHIR chamber. The first row in Fig. 2 shows the data of the ambient measurements (MPI-LIF, DWD-CIMS, FRCGC-LIF, FZJ-LIF-ambient) and the lower two rows present all OH concentrations measured at the SAPHIR chamber (MPI-LIF, FRCGC-LIF, FZJ-LIF-SAPHIR, FZJ-DOAS). Two instruments (MPI-LIF and FRCGC-LIF) submitted valid data for both ambient and chamber experiments. On the 22nd, only the FRCGC-LIF and the FZJ-DOAS provided valid data.
Ambient measurements include whole diurnal cycles during variable weather conditions, whereas chamber experiments were usually performed between 06:00 and 15:30 (UTC). There are no valid OH measurements of the MPI-LIF at night-time or when the louvre system of the chamber was closed, as explained in Sect. 2.1. The number of submitted data points N and their mean precision σ observed during this campaign are listed for each instrument, separately for ambient and chamber measurements, in Table 2. The mean measurement time intervals (Δt) per OH measurement range from 5 s (MPI-LIF) to 136 s (FZJ-DOAS). Directly related to the different Δt is the mean precision (σ), which ranged from 3.3×10⁵ cm⁻³ (FZJ-LIF-ambient) to 17×10⁵ cm⁻³ (MPI-LIF) for this campaign's data.
The MPI-LIF has the highest data acquisition rate (5 s) and thus collected the largest data set (N total = 20 564). Fast measurements entail a lower precision, which is (17 and 13)×10⁵ cm⁻³ for ambient and chamber measurements, respectively. The DWD-CIMS uses a similarly short integration time of 8 s for each OH measurement, but the complete measurement cycle takes longer (30 s). No valid chamber measurements were submitted and N total is thus only 4032. The average precision is 4.2×10⁵ cm⁻³. Like the two previous instruments, the FRCGC-LIF used a fixed time resolution, but it was changed during the campaign from 73 s to 51 s on 19 July. The mean precision was (11.0 and 9.0)×10⁵ cm⁻³. The FZJ-LIF-ambient used variable acquisition times (46 s to 355 s per measurement, mean 91 s). Time intervals were longer when the OH concentration was low in order to improve the limit of detection. The same applies to the FZJ-LIF-SAPHIR (24 s to 74 s), but mostly Δt was close to the mean of 36 s ± 4 s. Because of a high background signal and noise of the fluorescence detector, but also because of the shorter acquisition time, the standard deviation of the FZJ-LIF-SAPHIR data is 10×10⁵ cm⁻³, i.e. three times larger than the value for the FZJ-LIF-ambient (3.3×10⁵ cm⁻³). The FZJ-DOAS has the lowest average time resolution with an almost fixed acquisition time of 136 s. The precision of the DOAS instrument depends on the optical alignment and is independent of the OH concentration. The precision was on average 8.1×10⁵ cm⁻³.

Data processing
The original data of the participating instruments have very different time resolutions. Therefore, data sets for pairs of instruments were created on the original time intervals of the instrument with the longer time interval per measurement, by processing the data of the instrument with the higher time resolution: where several data points of the latter instrument fell within one time interval, their average and standard deviation were calculated. These values reflect the statistical and natural scattering of the OH measurements (external error) as well as the standard deviation stated for each measurement (internal error). This would be the preferred data basis for an intercomparison of two instruments, since it conserves the highest time resolution and the natural OH concentration may vary rapidly with the variable attenuation of sunlight. This variability is non-statistical and not well represented by the precision of the OH measurements, leading to different weights in the analysis. However, for comparing several instruments, for improving the precision, and for representation, a common time resolution is needed, which is determined by the instruments with the longest time intervals (FZJ-DOAS, FZJ-LIF-ambient). Data of all instruments were processed accordingly using 300 s time intervals that suit all participating instruments. The number of concurrent measurements is allowed to differ between the pairs of instruments that are compared. The data averaged to the time intervals of the instrument with the lower time resolution were analysed as a check for consistency and confirm the findings presented in this paper.
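The averaging onto a common 300 s grid can be sketched as follows. This is a minimal illustration that assumes each measurement is assigned to a bin by its time stamp alone and that bins are aligned to multiples of 300 s; the function name, the alignment, and the handling of single-point bins are assumptions made for the example and are not taken from the campaign's data processing.

```python
from collections import defaultdict
from statistics import mean, stdev


def average_to_bins(times_s, values, bin_s=300.0):
    """Average samples into fixed-width time bins.

    Returns {bin_start_s: (mean, standard deviation, number of samples)}.
    The standard deviation is NaN for bins holding a single sample.
    """
    bins = defaultdict(list)
    for t, v in zip(times_s, values):
        bins[bin_s * (t // bin_s)].append(v)

    result = {}
    for start in sorted(bins):
        samples = bins[start]
        spread = stdev(samples) if len(samples) > 1 else float("nan")
        result[start] = (mean(samples), spread, len(samples))
    return result


if __name__ == "__main__":
    # Synthetic OH values (in 1e6 cm^-3) at irregular times, for illustration only.
    times = [0, 100, 200, 300, 400]
    oh = [1.0, 2.0, 3.0, 4.0, 6.0]
    for start, (avg, sd, n) in average_to_bins(times, oh).items():
        print(start, avg, sd, n)
```

The standard deviation within each bin corresponds to the external error described above; the internal error stated for each measurement would have to be propagated separately.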
All valid OH measurements of the formal blind intercomparison, converted to averages over common 300 s time intervals are shown in the first row of Figs. 3, 4, and 5. The two lower rows present important chemical and physical parameters: NO and NO x , O 3 , CO, absolute humidity, and temperature.

Ambient measurements
The ambient measurements covered a 3-day period (9-11 July 2005). The OH data of all instruments with some key parameters are shown in Fig. 3 (300 s average). The period was characterised by moderate temperatures for the season, peaking at 28 °C on 10 July. While the first day started with ground fog (until 08:10 UTC) and was later characterised by scattered clouds (as seen from the strong fluctuations of the photolysis frequencies), the second day was almost cloud free. The sunny weather continued on the last day until 14:00, when a rain storm evolved. Wind came almost invariably from a northerly direction throughout all three days. Similar diurnal variations of trace gases were observed, with high NOx in the morning hours (up to 30 ppb) and a relatively constant CO of 200 ppb on average. Short CO peaks up to 320 ppb were encountered on 10 July 2005 at 10:00 and on 11 July 2005 at 08:00. VOC concentrations were dominated by up to 1 ppb each of benzene and toluene. Isoprene concentrations reached 1.6 ppb in the evenings, but ranged between 0.3-0.6 ppb during daytime and were below 0.3 ppb at night. Ozone showed a typical diurnal profile with very low mixing ratios at night and a strong increase starting at 06:00. Peak O3, however, was moderate and barely reached 70 ppb.
Not all OH instruments submitted valid data for the entire three day period, e.g. the MPI-LIF skipped night data for reasons discussed before and the FRCGC-LIF and the DWD-CIMS ceased measurements because of the weather conditions during the thunder storm of the last day. In addition, no OH data was collected during times of calibration which were usually scheduled between 17:00 and 18:00, but the number and duration of calibration and maintenance periods differed between the instruments.

Chamber measurements at SAPHIR
Six days of formal chamber measurements took place from 17-23 July 2005. The first three days were used to test potential interferences by humidity, NO x , and O 3 , respectively (Fig. 4). The instruments were compared in the chamber flushed with outside air on day 4. The following day was spent to investigate the ozonolysis of alkenes as a radical source in the dark. On the last day, OH was measured during photo-oxidation of a mix of hydrocarbons. Measurements of the last three days (21-23 July 2005) are presented in Fig. 5.

Test for interferences by water vapour
Four humidity levels were tested starting with the flushed, clean, dry chamber at a water vapour partial pressure below 0.07 mbar (dew point −44 °C). Each test phase lasted two hours, of which one hour was needed to change the gas mixture in the dark and one hour was used to expose the chamber to the sun (see 1st column in Fig. 4). On 17 July 2005 the sky was cloud free and it was very sunny, with moderate temperatures (278 K to 295 K). The main source for OH radicals is the photolysis of HONO that is released by the chamber wall. The water dependent HONO source has been described for SAPHIR with a heterogeneous formation term in the dark and a photolytic term (Rohrer et al., 2005). At the beginning of the experiment the HONO concentration was below (3±1) ppt and increased up to approximately 450 ppt for the highest water concentration. Another important radical source is the photolysis of HCHO, which is photochemically released by the chamber. Its concentration was below the detection limit at the beginning of the experiment (<0.07 ppb) and increased to 3 ppb at the end of the experiment. The background reactivity of the chamber produced up to 10 ppb O3, which is photolysed yielding OH in the presence of water vapour.
In the flushed dark chamber the OH data of all instruments (FRCGC-LIF, FZJ-LIF-SAPHIR, and FZJ-DOAS) scattered around zero within the respective precisions. OH data of all instruments ranged between (2 and 3)×10⁶ cm⁻³ during the 60 min of insolation of the first humidity step. After closing the louvre system all instruments detected zero OH in the chamber again. During the following insolation period at 3.7 mbar H2O (dew point −7 °C), however, some differences between the instruments were observed. The FZJ-LIF-SAPHIR and the MPI-LIF measured lower OH levels ((4-5)×10⁶ cm⁻³), the FZJ-DOAS slightly higher, and the FRCGC-LIF the highest values ((7-8)×10⁶ cm⁻³). For the next humidity level (up to 12.7 mbar H2O, dew point 10 °C), the measured OH concentrations of all instruments were again very similar ((8-10)×10⁶ cm⁻³). Also for the last step with the highest water concentration (up to 19.6 mbar H2O, dew point 16 °C) all instruments show identical OH concentrations within their precision. The highest average OH concentration measured throughout the campaign ((11-15)×10⁶ cm⁻³) was seen during this last step despite decreasing photolysis frequencies, because all major OH sources accumulated towards the end of the experiment while the concentration of organic trace gases that react with OH was very low. During the last two irradiation periods the fan was operated for 10 min each (11:10-11:20, 13:40-13:50), but no effect on the OH measurements was observed.

Test for interferences by NO x
On 18 July 2005 (2nd column in Fig. 4) (500-800) ppb CO, 20 ppb O 3 , and (3-6) mbar H 2 O were added to the chamber in order to assure conditions (background reactivity, humidity) that are relevant for field measurements. The NO x mixing ratio was changed in three steps (<0.22 ppb, 1.1 ppb, 3.5 ppb, and 8.8 ppb). Before the last step CO, O 3 , and H 2 O were added in order to compensate for the dilution by the replenishment flow. The HCHO and HONO concentrations reached 1.9 ppb and 190 ppt, respectively, towards the end of the experiment. The cycling between dark periods and insolation followed the scheme of the previous day. However, photolysis frequencies were lower because of a hazy sky and occasional clouds. The average OH concentration was considerably lower, mostly below 5×10 6 cm −3 as a consequence of the lower insolation, higher reactivity, and lower OH radical sources (less HONO and HCHO, but more O 3 and H 2 O). Like on the previous day, the instruments measured no significant OH concentrations during the dark periods and agree mostly during the insolation periods. During the last insolation period the fan was operated (13:35-13:40, 13:45-13:50, and 13:55-14:00). No change in OH concentration or scatter of the data caused by the enforced mixing or induced by the increased turbulence was observed for any of the instruments.

Test for interferences by ozone
Ozone was varied between 0 ppb and 150 ppb in steps of 50 ppb on 19 July 2005 (3rd column in Fig. 4). At the beginning of the experiment 17 ppb CO was present and 15 mbar H 2 O was added. NO x was (0.7-1.0) ppb. This day was partly cloudy and the temperature varied little (290 K-295 K). The HCHO concentration increased up to 2.9 ppb. The HONO production was first very large and the mixing ratio increased steeply during the first insolation period from 50 ppt to (450-500) ppt, but then decreased to reach 250 ppt at the end of the experiment.
During the first period, HONO was the most important OH source at a low OH reactivity; therefore the highest OH concentrations, up to 10×10⁶ cm⁻³, were measured by all instruments. The OH concentration during the following insolation periods was lower and highly variable because of the variable photolysis frequencies. On this day, the instruments show generally good agreement within the precision of the data, independent of the level of ozone. Interestingly, all instruments measured an increasing OH concentration different from zero in the dark chamber (no valid data of the MPI-LIF). The average OH concentration in the dark was found to be approximately 1×10⁶ cm⁻³ at the end of the experiment. In order to test the contribution of OH produced and detected by the laser beam of the FZJ-DOAS, the fan was operated during three intervals (07:50-07:55, 09:50-09:55, and 13:40-13:50) in addition to the periods of mixing during O3 addition, but no significant change in the OH concentration was observed. Another test was conducted after the experiment by increasing the UV laser power to 4 mW during an interval without fan operation. The OH concentration measured by DOAS increased to a maximum of 4×10⁶ cm⁻³; this interference is therefore estimated to have been well below 1×10⁶ cm⁻³ during the experiment.

Aging of Jülich ambient air
On 21 July 2005 the dark SAPHIR chamber was flushed with particle filtered ambient air. The intention of this experiment was to compare the OH instruments using outside air without local emissions. As shown in the first column in Fig. 5 the chamber volume was exposed to daylight two times: 07:00-09:02 and 10:00-12:00. The fan was turned on 10:40-10:50 to test homogeneity within the chamber.
The FZJ-DOAS instrument revealed, in addition to the absorbance by OH and HCHO, significant contributions by 2.5 ppb SO 2 and 60 ppt naphthalene (C 10 H 8 ). Both compounds are markers for fossil fuel combustion by several large, lignite-fired power plants near Jülich. Other combustion markers include 160 ppb CO and 14 ppb NO x . Benzene and toluene were about 0.5 ppb each and biogenic VOCs were below 0.2 ppb. HCHO was 1.3 ppb at the beginning of the experiment and increased to 3.3 ppb during the course of the two periods of insolation. The HONO concentration at the beginning of the experiment was approximately 250 ppt and increased to 490 ppt after the first insolation period and then decreased continuously to 290 ppt. Ambient air had 9 ppb O 3 , which increased up to 47 ppb during the second insolation. From 11:00 to 11:15, approximately 500 ppm of CO was added in order to completely scavenge OH.
During this mostly cloudy day with temperatures around 290 K the OH measurements were variable and mostly less than 5 ×10 6 cm −3 during the first period of insolation. The FRCGC-LIF detected up to 10×10 6 cm −3 of OH during the second insolation period, while other instruments showed approximately 6×10 6 cm −3 . After addition of CO the data of the FZJ-DOAS, the FZJ-LIF-SAPHIR, and the FRCGC-LIF are not significantly different from zero, while the data of the MPI-LIF shows a small offset of (7±2)×10 5 cm −3 . The offset showed up during insolation and therefore cannot be explained by the known artefact in the dark. It is likely caused by a small interference to HO 2 , which is detected in the MPI-LIF instrument downstream of the OH detection cell by chemical conversion with added NO. Given the high HO 2 concentrations of about 6×10 8 cm −3 in the SAPHIR chamber after CO addition, small amounts of NO contamination, for example, by backdiffusion, may have caused the small offset in the OH measurements. An interference of this magnitude, however, has little relevance for atmospheric conditions, where HO 2 /OH ratios are typically 10-100.

Ozonolysis of alkenes
This experiment was designed to form different, nearly constant HO2 concentration levels by reacting alkenes with O3 in the dark (second column in Fig. 5). Only very small steady-state concentrations of OH are expected, which makes the experiment sensitive to potential interferences due to HO2 and reactive VOCs. After the addition of water vapour (9 mbar, dew point 5 °C) and 100 ppb O3, the experiment was started by addition of 6 ppb pent-1-ene at 07:30. Another 15 ppb was added at 09:05 and the last addition of 25 ppb pent-1-ene was at 10:30. A second block of alkene injections followed in order to increase the OH yield and to test the upper range of HO2. Four 200 ppb injections of trans-2-butene were applied at 12:08, 12:34, 12:53, and 14:15. There was 70 ppb O3 left during the first injection, and O3 was titrated by the following alkene additions.
As noted before, only OH data of FRCGC-LIF and FZJ-DOAS can be compared on this day. A potential interference with O3 of the DOAS instrument (Sect. 2.1.1) was counteracted by using a low UV laser power and by operation of the fan throughout the experiment. We estimate that it was below 2×10⁵ cm⁻³ during this experiment. Very good agreement of OH measured by the two different techniques was found. Both instruments reported a non-zero OH concentration, (0.40±0.05)×10⁶ cm⁻³ (FRCGC-LIF) and (0.47±0.10)×10⁶ cm⁻³ (FZJ-DOAS), before trans-2-butene was added. No change of the OH concentration was observed when pent-1-ene was added, and no influence of the increasing HO2 levels is discernible during the first part of the experiment. However, the addition of a large amount of trans-2-butene is reflected by a distinct rise in the OH concentration detected by both instruments. The last addition of alkene produced no further increase in OH at the end of the experiment, because the OH production ceased with the titration of O3. The FRCGC-LIF measured up to 3.0×10⁶ cm⁻³ of OH, while the FZJ-DOAS measured 1.9×10⁶ cm⁻³. After the last pent-1-ene addition HO2 was in the range of 4×10⁸ cm⁻³. High levels of up to 38×10⁸ cm⁻³ were created, as measured by LIF, after the third addition of trans-2-butene. The measured HO2/OH ratio was then approximately 2000. The FRCGC-LIF measures OH and HO2 alternately, and a tiny NO leak would form OH from HO2. This might explain the difference to the FZJ-DOAS measurement. However, it was confirmed during calibration of the FRCGC-LIF that a HO2/OH ratio of up to 500 can be measured without interferences.

Photooxidation of hydrocarbons
OH concentrations were measured in synthetic air with added hydrocarbons, including alkanes, alkenes and aromatic compounds. The following trace gases were added: water vapour (11 mbar, dew point 10 °C), NO (0.7 ppb), O3 (17 ppb) and 6 different hydrocarbons (5 ppb benzene, 3 ppb 1-hexene, 2.5 ppb m-xylene, 3 ppb n-octane, 3 ppb n-pentane, and 1 ppb isoprene). The last formal chamber experiment is shown in the 3rd column in Fig. 5. Photochemistry was started by opening the louvre system (08:10), but sunlight was modulated by a broken cloud cover. Initially, up to 350 ppt of HONO were formed, which later decreased to 180 ppt. Photooxidation of VOCs resulted in the production of up to 29 ppb O3 and 4.3 ppb HCHO. HO2 and RO2, measured by LIF and MIESR, respectively, were in the range of (1.5-5.0)×10⁸ cm⁻³. The measurements of all instruments showed good agreement within the precision of the measurements in the dark and in daylight.

Correlation
The Pearson correlation coefficient r was calculated for the 300 s averaged OH data of each available instrument pair. The square of the correlation coefficient r 2 is a measure of how much OH variation measured by one instrument is also observed by the other. The correlation results and the number N of comparable data of each instrument pair are listed in Table 3.
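As an illustration of the quantity used here, the squared Pearson correlation coefficient of two paired series can be computed as in the following sketch. The function name and the synthetic input values are assumptions for the example; the campaign data are not reproduced.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Two hypothetical instruments sampling the same 300 s intervals (1e6 cm^-3).
    instrument_a = [1.0, 2.5, 4.0, 6.5, 8.0]
    instrument_b = [1.2, 2.1, 4.4, 6.0, 8.3]
    print(pearson_r2(instrument_a, instrument_b))
```

Note that r² is insensitive to a constant calibration factor between the two instruments, which is why the regression slopes are analysed separately.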
Atmos. Chem. Phys., 9, 2009 www.atmos-chem-phys.net/9/7923/2009/
Table 4. Result of the regression to the data (300 s mean values): the regression slope (b), the intercept (a, in units of 10⁶ cm⁻³), and the sum of the squared residuals divided by the number of data points (χ²/(N−2)), which serves as a measure of the fit quality. b0 is the ratio of the means of the two data sets (ȳ/x̄).

The correlation coefficients r² of the 300 s-averaged, combined data sets in Table 3 range between 0.71 (FRCGC/MPI) and 0.96 (ambient CIMS/MPI), which includes both ambient and chamber measurements. These results indicate that between 71% and 96% of the OH variability measured by all instrument pairs is real. The results are similar for the ambient and the chamber measurements. The instruments can be ordered from high to low r² when the possible combinations of three instrument pairs are compared: DWD-CIMS, MPI-LIF, FZJ-DOAS, FZJ-LIF-ambient, FZJ-LIF-SAPHIR, FRCGC-LIF. This is basically also the order of the a-priori stated precision of the different instruments, when averaged over a common time step. Experimental data of each instrument have a statistical dispersion described by the precision of its OH measurements. The finite dispersion results in r² < 1.00 even if the variation is entirely explained by the precision. A Monte Carlo analysis was used to assess the influence of the precision as opposed to other potential non-statistical errors. For each instrument pair, 1000 random data sets were generated to determine the value r²µ expected when the two data sets are identical but each is perturbed according to the respective instrument's precision, varied randomly for each data point using a normal distribution. Because the resulting distribution is not Gaussian, a Fisher transformation was used to calculate its centre (r²µ) and the 2.5% and 97.5% percentiles that are listed in the third subcolumn of Table 3. The experimental values of r² agree with r²µ within the 2.5% and 97.5% percentiles except for two instrument pairs: FZJ-LIF-ambient versus DWD-CIMS and versus MPI-LIF.
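A Monte Carlo estimate of this kind can be sketched as below. The sketch assumes a common "true" OH series, a single constant precision per instrument (the analysis in the text uses the precision of each data point), and a positive correlation so that squaring after the back-transformation preserves the percentile order. The function names, the seed, and the synthetic input are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def expected_r2(truth, sigma_x, sigma_y, n_trials=1000, seed=0):
    """Expected r^2 (centre, 2.5% and 97.5% percentiles) for two instruments
    that see the same true signal but add independent Gaussian noise."""
    rng = random.Random(seed)
    z_values = []
    for _ in range(n_trials):
        x = [v + rng.gauss(0.0, sigma_x) for v in truth]
        y = [v + rng.gauss(0.0, sigma_y) for v in truth]
        r = pearson_r(x, y)
        r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
        z_values.append(math.atanh(r))        # Fisher transformation
    z_values.sort()
    centre = math.tanh(sum(z_values) / n_trials) ** 2
    low = math.tanh(z_values[int(0.025 * (n_trials - 1))]) ** 2
    high = math.tanh(z_values[int(0.975 * (n_trials - 1))]) ** 2
    return centre, low, high


if __name__ == "__main__":
    # Synthetic diurnal-like OH profile in 1e6 cm^-3; precisions are illustrative.
    profile = [5.0 * math.sin(math.pi * i / 48) ** 2 for i in range(48)]
    print(expected_r2(profile, sigma_x=0.4, sigma_y=1.0))
```

An experimental r² falling below the lower percentile would then point to a contribution beyond the stated precisions, which is the argument made for the two ambient instrument pairs.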
The lower than expected r 2 of these instrument pairs is possibly caused by an unknown systematic instrumental error or probing of different air influenced by local emissions. The latter possibility is favoured by the distance between the DWD/MPI and FZJ instruments that was larger than for other instrument combinations. On the other hand, the experimental r 2 of instrument pairs at the chamber is found to be always within the confidence intervals. This suggests that all instruments sampled correctly the same OH concentration that is expected in a homogeneous environment as provided by the SAPHIR chamber.

Regression
Linear regressions (y=a+b·x) were calculated for the six possible instrument combinations of the measurements at the field site and for the six combinations at the SAPHIR chamber. The regressions account for the statistical errors of both instruments (x- and y-axis) based on the algorithm "fitexy" proposed by Press and Teukolsky (1992). Additionally, the slopes of regressions forced through the origin (y=b0·x) were calculated, where b0 corresponds to the ratio of the mean OH concentrations measured by the respective instruments. The results are shown in Figs. 6 and 7 (without error bars, see Table 2 for average standard deviations) and in Table 4.
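A straight-line fit with errors in both coordinates can be illustrated with the simplified stand-in below. It minimises χ² = Σ (y − a − b·x)² / (σy² + b²·σx²) by scanning the slope angle and refining with a golden-section search; it is not the Numerical Recipes routine itself, it returns no parameter uncertainties, and it assumes that σy and σx are not both zero for any point. The function names and the test data are hypothetical.

```python
import math


def chi2_for_slope(b, x, y, sx, sy):
    """Return (chi2, a) with the intercept a optimised analytically for slope b."""
    w = [1.0 / (syi ** 2 + (b * sxi) ** 2) for sxi, syi in zip(sx, sy)]
    a = sum(wi * (yi - b * xi) for wi, xi, yi in zip(w, x, y)) / sum(w)
    chi2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return chi2, a


def fit_line_xy_errors(x, y, sx, sy, n_grid=721):
    """Fit y = a + b*x with errors in x and y; returns (a, b, chi2/(N-2))."""
    def cost(theta):
        return chi2_for_slope(math.tan(theta), x, y, sx, sy)[0]

    # Coarse scan over the slope angle to bracket the global minimum.
    thetas = [-math.pi / 2 + math.pi * (i + 0.5) / n_grid for i in range(n_grid)]
    k = min(range(n_grid), key=lambda i: cost(thetas[i]))
    lo, hi = thetas[max(k - 1, 0)], thetas[min(k + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if cost(c) < cost(d):
            hi = d
        else:
            lo = c
    b = math.tan(0.5 * (lo + hi))
    chi2, a = chi2_for_slope(b, x, y, sx, sy)
    return a, b, chi2 / (len(x) - 2)


def ratio_of_means(x, y):
    """b0 as described in the text: ratio of the mean of y to the mean of x."""
    return (sum(y) / len(y)) / (sum(x) / len(x))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 2.3, 3.0, 4.4, 5.1]
    print(fit_line_xy_errors(xs, ys, [0.2] * 5, [0.3] * 5))
    print(ratio_of_means(xs, ys))
```

A value of χ²/(N−2) near one indicates that the scatter about the line is compatible with the stated precisions of both instruments; clearly larger values suggest additional, non-statistical differences.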
The regressions to the ambient data of three days are in general linear (Fig. 6). The regression between MPI-LIF and DWD-CIMS data, upper left panel in Fig. 6 and last row in Table 4, revealed the strongest deviation from unity slope (b=0.59±0.01 and b0=0.62±0.03), although the data agree extremely well on a relative scale. These instruments have the highest time resolution and the best precision at the imposed time resolution of 300 s, so any systematic time-dependent deviations would be easily detectable. However, both instruments measured invariably and precisely the same relative OH concentrations at the field site. Only on the first day of the ambient measurements (9 July 2005) do the data appear to deviate slightly from linearity. This could be caused either by a small positive offset of DWD-CIMS during the foggy morning, or by a small but increasing offset of MPI-LIF in the course of the day. However, any potential offsets appear negligible in view of the consistent and precise measurements throughout the three days of ambient sampling. This implies that the systematic deviation from unity slope is not due to inhomogeneities in the air sampled by either instrument, which could be caused by local emissions, but arises from a calibration difference. The deviation from unity slope is just within the limits of the combined calibration accuracies specified for the instruments (see Table 1: 32% (MPI) and 38% (DWD)).
Fig. 6. Linear regression to ambient OH measurements (averaged to 300 s intervals) with slope b (solid, black); linear regression forced through the origin with slope b0 (solid, blue), unity slope for comparison (dashed).
The lower precision of the other instruments obscures relative sensitivity trends during the three days, but FZJ-LIF-ambient and FRCGC-LIF compare better than any of their combinations with other instruments: the slope of the regression between FZJ-LIF-ambient and FRCGC-LIF, right panel in the middle row of Fig. 6 and row 6 in Table 4, is close to unity (b=1.06±0.02 and b0=0.95±0.06), although the data points are significantly more scattered than those of DWD-CIMS and MPI-LIF, which show the least agreement on an absolute scale. The slopes of all other regressions (see Table 4) are intermediate between these extremes.
Based on this observation and the correlation results, two groups of instruments can be identified that compared well at the field site: on the one hand DWD-CIMS and MPI-LIF, and on the other hand FRCGC-LIF and FZJ-LIF-ambient. Only systematic inhomogeneities at this site would explain the existence of two distinct groups. Indeed, DWD-CIMS and MPI-LIF were located next to each other (5.5 m, see Fig. 1). FRCGC-LIF was adjacent to MPI-LIF (3.2 m) and FZJ-LIF-ambient (4.5 m), and both FRCGC-LIF and FZJ-LIF-ambient were downwind of the other two instruments. The intercepts of the regression lines are small compared to daytime OH values and range from (−0.04±0.03)×10⁶ cm⁻³ (FZJ-LIF/DWD-CIMS) to (−0.63±0.15)×10⁶ cm⁻³ (FRCGC-LIF/MPI-LIF). The intercepts of some instrument combinations are statistically significant, which may partly result from having two systematically differing groups of instruments. The slightly larger OH concentration measured by FRCGC-LIF relative to FZJ-LIF-ambient and DWD-CIMS in the night of 10-11 July 2005 (Fig. 3) is another possible contribution. If the regression parameter χ² listed in Table 4 is in the range of the number of data (χ²≈N−2), i.e. the ratio χ²/(N−2)≈1, then the residual variation is explained by the precision of both instruments. This is indeed the case for all instruments except for the ambient measurements involving FZJ-LIF-ambient and MPI-LIF or DWD-CIMS (χ²/(N−2)≥4).
For FZJ-LIF-ambient the scatter of the data is not explained by the calculated measurement errors. But it is more in line when FZJ-LIF is compared with FRCGC-LIF, yielding χ 2 /(N−2)=1.4. Most likely this is caused by the systematic difference between the two groups of instruments, in agreement with the findings of the correlation analysis of r 2 .
The regression analysis of the OH data measured during six days in the SAPHIR chamber indicates very good agreement for all OH instruments for all days (Fig. 7, Table 4). In fact, the slopes of the regression lines deviate no more than 13% from unity for all instrument combinations, which is better than expected from the stated accuracies.
It should be noted that half of the dynamic OH concentration range is determined by two days (17 and 19 July 2005). Data of 22 July 2005 is missing for all instrument pairs except for FZJ-DOAS and FRCGC-LIF. The slopes (b and b 0 ) calculated for chamber data of all six days agree within the error margins for all instruments, suggesting negligible offsets between different instruments. This is also demonstrated by the calculated intercepts of the regression lines which are not significantly different from zero. The values calculated for χ 2 /(N−2) are 1.1 to 1.8 and good for experimental data. The residual variation is mostly explained by the measurement errors and the OH data sets agree quantitatively. This implies that the instruments sampled the same OH concentration and it also demonstrates that SAPHIR offers a homogeneous air composition suitable for instrumental intercomparisons.

Comparison of ambient and chamber results
Few instruments provided data that allow a comparison of the results from the ambient and chamber intercomparisons. MPI-LIF and FRCGC-LIF were the only instruments that measured both in ambient and chamber air. Furthermore, FZJ-LIF-ambient and FZJ-LIF-SAPHIR, which are technically similar and share the same calibration unit, measured in ambient and chamber air, respectively. All LIF instruments showed very good agreement among each other in the SAPHIR chamber and in comparison with the calibration-independent DOAS instrument. In ambient air, however, the slope of FRCGC-LIF/MPI-LIF was larger by 25% than in the chamber, the slope of FZJ-LIF/MPI-LIF larger by about 17%, while the corresponding slope of FRCGC-LIF/FZJ-LIF was larger by about 20%. As discussed before, inhomogeneous air has probably influenced the slopes of MPI-LIF versus FRCGC-LIF and FZJ-LIF in ambient air, but there is no such indication for FRCGC-LIF versus FZJ-LIF. This suggests that sensitivity changes may have occurred in ambient air for the LIF instruments, which may be on the order of 20% and are not accounted for by the calibration procedures. It is not possible to resolve the differences between the OH measurements in ambient air since no ambient DOAS measurements are available as absolute reference.

Interferences
Trace gases that are known to interfere with OH measurements (e.g. LIF quenching by water vapour) are routinely accounted for in the data evaluation as has been outlined in Sect. 2.1 for the respective instruments. The first four days of chamber measurements were used to check the validity of these corrections and to reveal potential unknown interferences of other trace gases by varying the concentrations of H₂O, O₃, NOₓ, ROₓ, and VOCs. The FZJ-DOAS data was chosen as reference because of its high accuracy.
Since the chemical conditions inside the chamber were changed in the periods when the louvre system was closed, measurements during these periods were excluded from this analysis. The residuals (ΔOH) of the regression of LIF versus DOAS data were binned for each insolation period and plotted as a function of the corresponding concentrations of H₂O, NOₓ, O₃, and HO₂ (Fig. 8). The minimum, 25%-quartile, median, 75%-quartile, and maximum were calculated for each bin and are presented as box whisker plots. Positive values of ΔOH indicate that a LIF instrument measured relatively higher OH concentrations than the DOAS instrument.
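The binning described above can be sketched in Python as follows. This is a minimal illustration, not the campaign's evaluation code: the function names and sample residuals are hypothetical, and the "inclusive" quartile convention of `statistics.quantiles` is an assumption, since the text does not state which quartile definition was used.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, 25% quartile, median, 75% quartile, max) of one bin."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def summarise_by_bin(residuals, bin_labels):
    """Group residuals by bin label and summarise each bin.

    residuals  : regression residuals (LIF minus fitted DOAS-based value)
    bin_labels : one label per residual, e.g. the insolation period
    """
    bins = {}
    for residual, label in zip(residuals, bin_labels):
        bins.setdefault(label, []).append(residual)
    return {label: five_number_summary(vals) for label, vals in bins.items()}


# Hypothetical residuals in units of 10^6 cm^-3 for two insolation periods.
residuals = [1, 2, 3, 4, 5, -1, 1]
labels = ["period_1"] * 5 + ["period_2"] * 2
summary = summarise_by_bin(residuals, labels)
print(summary["period_1"])
```

Each tuple corresponds to one box in a box whisker plot; a bin whose median lies clearly away from zero would hint at a cross sensitivity to the binned trace gas.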
The plots in the first column of Fig. 8 show the analysis with respect to different absolute humidity levels. The scatter of ΔOH is large because of the combined precision of two instruments. For all but the second humidity level (3.6 mbar H₂O), no large deviation is found. At this level, the MPI-LIF measured systematically lower OH concentrations than the DOAS instrument (−1.6×10⁶ cm⁻³), whereas the FRCGC-LIF measured OH concentrations that were higher by 1.2×10⁶ cm⁻³. This deviation is unexpected, because it is unrelated to the water concentration and because inhomogeneity inside the chamber is unlikely. Therefore, it must be attributed to a temporal instability of the OH sensitivity of these two instruments. Overall, no systematic trend regarding a potential cross sensitivity to water vapour is observed. The OH sensitivity of the LIF instruments was successfully corrected for the increase in the quenching rate with increasing mixing ratios of water vapour.
The differences between DOAS and LIF are investigated with regard to different NOₓ levels as shown in the second column of Fig. 8. The OH concentrations of this cross sensitivity test were lower than for the other tests. The data does not reveal any trends, and no cross sensitivity to NOₓ can be detected in the measurements of any instrument.
OH interference by laser photolysis of ozone has been a severe problem in atmospheric OH measurements in the past (Smith and Crosley, 1990), but is assumed to be essentially eliminated in current OH laser instruments. This is confirmed by a corresponding interference test on the third day of chamber experiments (see Sect. 3.4.3). Figure 8 shows no significant differences between the LIF instruments and DOAS, and no trend is observed even when ozone was increased up to 143 ppb.
The experiment with ambient air was used to investigate a potential HO₂ interference. During the second part of the experiment CO was added to scavenge OH and produce HO₂. Only two bins were used here, the first one with HO₂ concentrations below 0.5×10⁸ cm⁻³, the second one at (6±2)×10⁸ cm⁻³. The MPI-LIF did measure OH concentrations of (7±2)×10⁵ cm⁻³ after CO had been added to completely scavenge OH, as discussed in Sect. 3.4.4. But considering the precision of this analysis, this potential interference cannot be confirmed. For none of the LIF instruments can a significant influence of the HO₂ concentration on the OH measurement be detected for conditions relevant for the atmosphere.
Fig. 8. The residual differences of OH data measured by the three different LIF instruments and FZJ-DOAS versus variable water vapour, NOₓ, O₃, and HO₂ concentrations. The box whisker plots indicate minimum, 25%-quartile, median, 75%-quartile, and maximum.

Conclusions
HOxComp was the first formal, blind intercomparison campaign of OH measurements which involved six different instruments (4 LIF, 1 CIMS, and 1 DOAS) operated by Japanese and German groups. It covered three days of measurements in ambient air and six days of measurements in the atmosphere simulation chamber SAPHIR. The ambient conditions were moderately polluted with substantial levels of biogenic VOCs. In this work we obtained a number of findings which we think are of importance for the interpretation of past, present, and future OH measurements:
-Intercomparisons of radical measurements in ambient air are very demanding and error sources cannot be fully controlled. This was already encountered during previous experiments (e.g. TOHPE and POPCORN: Mount et al., 1997; Hofzumahaus et al., 1998). Here, it cannot be excluded that nearby buildings and local emissions might have influenced the quality of the intercomparison.
-The SAPHIR simulation chamber proved to be a valuable platform for the intercomparison, as has been demonstrated before (Apel et al., 2008). The chamber overcomes the problem of sampling inhomogeneities which cannot be excluded in an open environment.
-All instruments in this study can measure OH radicals at the levels encountered in the troposphere. The recorded time series of the instruments are highly correlated; the correlation coefficients are well within the confidence bands calculated from the a-priori stated precisions of the individual instruments.
-The absolute intercepts of pairwise linear regressions never exceeded 0.6×10⁶ cm⁻³, mostly being insignificant. Since some low OH data recorded in the dark had to be excluded from the analysis, it is not possible to fully address the questions of nighttime OH. Nevertheless, this study shows that for daytime OH measurements (at levels between 1×10⁶ cm⁻³ and 1.5×10⁷ cm⁻³) offsets in the data are most likely of minor importance.
-The slopes of the pairwise linear regressions were between 1.06 and 1.69 for the ambient part and between 1.01 and 1.13 for the chamber part of the campaign. The chamber slopes are well within the margins set by the accuracies of the individual instruments. We found evidence that sampling inhomogeneities cannot be the only cause of the wider range of the ambient slopes. It is concluded that calibration problems are most likely involved.
-In the SAPHIR chamber we could assess the question of interferences by water, ozone, nitrogen oxides, and peroxy radicals under well-defined conditions. At the significance level of this study we did not find any cross sensitivities in addition to those which are routinely accounted for in the data evaluation of the individual instruments. This shows how well the instruments were designed and characterised before the campaign.
The ambient air part of this study was performed under moderately polluted conditions while the chamber part, without CIMS, covered a higher variability of chemical conditions. Also we focussed here on daytime measurements. This study explored only a subset of possible conditions where ambient OH measurements are needed. Nonetheless, this OH intercomparison provides evidence for the high quality standard of the current DOAS-, LIF-, and CIMS-based OH measurement techniques. All participating instruments provided highly time-resolved OH data without significant interferences and offsets during daytime measurements.
Generally, water photolysis is a suitable OH source for the calibration. However, the stability and accuracy of the current calibration devices are still a major source of uncertainty in OH measurements. Thus, we encourage the development of a robust portable OH calibration standard fitting the majority of current OH instruments to overcome this problem.
Intercomparisons under well controlled conditions are the best way to ensure the quality of atmospheric OH radical measurements. Future intercomparisons should cover a larger range of parameters, e.g. nighttime or high VOCs, conditions where the understanding of the HOx chemistry is under discussion (Hofzumahaus et al., 2009).

Appendix A Calibration
The participating groups apply the same principle of producing quantitative amounts of OH by photolysis of water vapour at 185 nm for calibration of the CIMS and the LIF instruments (Sect. 2.1.4). Technical details differ and are described briefly in this section.
The DWD-CIMS has a calibration unit built into the 10 cm diameter air inlet tube. OH radicals are produced during ambient air sampling from photolysis of atmospheric water vapour by switching on a mercury lamp, which is placed in front of the sampling nozzle. The fast flow rate ensures that radicals are well-mixed within the turbulent air stream and radical losses between production and sampling point are negligible. Typical OH concentrations, which can be produced with this method, are within the range of (15-35)×10⁶ cm⁻³. OH concentration values are calculated from the UV light flux, which is accurately measured by a solar blind VUV cathode, and concurrent ambient H₂O measurements (Berresheim et al., 2000). The VUV cathode is calibrated by PTB (Physikalisch-Technische Bundesanstalt, Braunschweig, Germany) every year with an accuracy of 4% and the distribution of the UV radiation is measured in regular intervals (typically 4 weeks, 4 times during HOxComp).
All LIF instruments use removable calibration sources, which consist of a flow tube and an illumination unit at the end of the flow tube that is placed immediately in front of the sampling nozzle. The radical sources are supplied with humidified synthetic air at a high flow rate. The MPI-LIF calibration unit uses a high flow rate of 50 slm and an average streaming velocity of 3.6 m s⁻¹ to ensure turbulent flow and thus a flat velocity profile. The other two groups (FZJ and FRCGC) use calibration units, which exhibit laminar flow in a cylindrical flow tube (diameter: 19 mm and 26 mm, length: 680 mm and 500 mm, respectively) at a flow rate of 20 slm. Radical losses between production and sampling point were characterised for the MPI-LIF source in laboratory experiments. They are negligible in the case of laminar flow tubes as used in the FRCGC and FZJ sources.
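The laminar character of the FZJ and FRCGC flow tubes can be checked roughly with the pipe Reynolds number Re = 4Q/(πDν). The Python sketch below is only an order-of-magnitude estimate under stated assumptions: the kinematic viscosity of air is taken as 1.5×10⁻⁵ m² s⁻¹, standard litres per minute are treated as actual volumetric flow, and the function and variable names are hypothetical.

```python
import math

# Assumed kinematic viscosity of air near room temperature, in m^2/s.
KINEMATIC_VISCOSITY_AIR = 1.5e-5


def reynolds_number(flow_slm, diameter_mm, nu=KINEMATIC_VISCOSITY_AIR):
    """Reynolds number for flow in a cylindrical tube.

    flow_slm    : volumetric flow in litres per minute (slm treated as actual flow)
    diameter_mm : inner tube diameter in millimetres
    """
    flow_m3_s = flow_slm / 1000.0 / 60.0
    diameter_m = diameter_mm / 1000.0
    return 4.0 * flow_m3_s / (math.pi * diameter_m * nu)


re_fzj = reynolds_number(20.0, 19.0)    # 19 mm tube at 20 slm
re_frcgc = reynolds_number(20.0, 26.0)  # 26 mm tube at 20 slm
print(round(re_fzj), round(re_frcgc))
```

Both estimates come out at roughly 1500 and 1100, below the conventional transition value of about 2300 for pipe flow, which is consistent with the laminar flow stated for these two sources.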
Actinometry with either O₂/O₃ (FRCGC, FZJ) or with N₂O/NO (MPI) is applied to determine absolute OH concentrations instead of directly measuring the actinic flux. O₂/O₃ actinometry takes advantage of the photolysis of oxygen (at 184.9 nm), which occurs simultaneously with the photolysis of water vapour and leads to the formation of ozone. The actinic flux can therefore be substituted by the concentration of O₃ formed (Schultz et al., 1995). O₃ is measured in the excess gas of the FRCGC radical source during calibration. In this case the difference between the O₃ concentration in the excess gas and the sampled gas has to be taken into account, because the velocity profile of the gas in the flow tube is not flat for laminar flow conditions. The center part of the flow, which is sampled by the instrument, is faster and thus has a shorter residence time within the illuminated zone. The ratio between the O₃ concentration in the sampled gas and in the excess gas is determined in laboratory experiments. This factor is applied during the calibration procedure. In the FZJ radical source the intensity of the mercury lamp, which provides the 184.9 nm radiation, is monitored during calibration by a phototube. The light intensity of the mercury lamp versus the ozone concentration in the sampled air is regularly characterised in laboratory experiments and thus gives a measurement of the ozone in the sampled gas. The MPI radical source has been characterised by N₂O/NO actinometry (Faloona et al., 2004; Martinez et al., 2008). The photolysis of N₂O (at 184.9 nm) yields NO, which can easily be measured by chemiluminescence. As for O₂/O₃ actinometry, the measured NO concentration yields the actinic flux. The large N₂O concentration required to produce NO concentrations that can be accurately measured by a chemiluminescence detector partially absorbs the 184.9 nm radiation. Measurements are corrected for this effect.
The calibration unit that the MPI group uses to determine the actinic flux of the UV lamp incorporates a narrow (3 mm) reaction chamber and a high gas velocity (1 slm) in order to ensure a turbulent flow profile and a negligible effect of absorption by N₂O. The gas flow, the N₂O concentration, and its carrier gas (N₂ or He) are varied as a control. A second, separate calibration unit is used for the OH calibration of the MPI-LIF and is a flow tube with a square cross section of 16 mm×16 mm.
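The principle of O₂/O₃ actinometry described above can be written out in a short derivation. The symbols are introduced here for illustration and are not taken from the text: F is the 184.9 nm photon flux, t the irradiation time, σ the absorption cross sections, and φ the quantum yields. The derivation assumes optically thin absorption and identical irradiation of water vapour and oxygen.

```latex
\begin{align}
[\mathrm{OH}]  &= [\mathrm{H_2O}]\,\sigma_{\mathrm{H_2O}}\,\phi_{\mathrm{OH}}\,F\,t \\
[\mathrm{O_3}] &= [\mathrm{O_2}]\,\sigma_{\mathrm{O_2}}\,\phi_{\mathrm{O_3}}\,F\,t \\
\Rightarrow\quad
[\mathrm{OH}]  &= [\mathrm{O_3}]\,
  \frac{[\mathrm{H_2O}]\,\sigma_{\mathrm{H_2O}}\,\phi_{\mathrm{OH}}}
       {[\mathrm{O_2}]\,\sigma_{\mathrm{O_2}}\,\phi_{\mathrm{O_3}}}
\end{align}
```

Dividing the first relation by the second eliminates the product F t, so the measured ozone concentration takes the place of the actinic flux. The quantum yields are commonly taken as φ_OH = 1 and φ_O₃ = 2, because each photolysed O₂ molecule yields two oxygen atoms and hence two ozone molecules. The same reasoning applies to N₂O/NO actinometry, with NO as the measured product.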