Simple proxies for estimating the concentrations of monoterpenes and their oxidation products at a boreal forest site

The oxidation products of monoterpenes likely have a crucial role in the formation and growth of aerosol particles in boreal forests. However, continuous measurements of monoterpene concentrations are usually not available on decadal time scales, and direct measurements of the concentrations of monoterpene oxidation products are so far scarce. In this study we developed proxies for the concentrations of monoterpenes and their oxidation products at a boreal forest site in Hyytiälä, southern Finland. For deriving the proxies we used the monoterpene concentration measured with a proton transfer reaction mass spectrometer (PTR-MS) during 2006–2013. Our proxies for the monoterpene concentration take into account the temperature-controlled emissions from the forest ecosystem, the dilution caused by the mixing within the boundary layer, and different oxidation processes. All the versions of our proxies captured the seasonal variation of the monoterpene concentration, the typical proxy-to-measurements ratios being between 0.8 and 1.3 in summer and between 0.6 and 2.6 in winter. In addition, the proxies were able to describe the diurnal variation of the monoterpene concentration rather well, especially in summer months. By utilizing one of the proxies, we calculated the concentration of oxidation products of monoterpenes by considering their production in the oxidation and their loss due to condensation on aerosol particles. The concentration of oxidation products was found to have a clear seasonal cycle with the maximum in summer and the minimum in winter. The concentration of oxidation products was lowest in the morning or around noon and highest in the evening. In the future, our proxies for the concentrations of monoterpenes and their oxidation products can be used, for example, in the analysis of new particle formation and growth in the boreal environment.


Introduction
Terrestrial ecosystems emit large amounts of biogenic volatile organic compounds (BVOCs) into the atmosphere (Guenther et al., 2012), where they are oxidized, forming less volatile vapors. In boreal forests BVOC emissions are typically dominated by monoterpenes (Hakola et al., 2006). Recent studies have shown that the low-volatility oxidation products of monoterpenes may participate in atmospheric particle formation and growth, and thus affect the aerosol-radiation interactions and the concentrations of cloud condensation nuclei in the atmosphere (Kulmala et al., 1998; O'Dowd et al., 2002; Kulmala et al., 2013; Paasonen et al., 2013; Ehn et al., 2014; Jokinen et al., 2015). Therefore, knowledge of the concentrations of monoterpenes and their oxidation products is crucial when estimating the climate effects of aerosol particles.
The total concentration of monoterpenes in the boundary layer can be measured using online techniques such as a proton transfer reaction mass spectrometer (PTR-MS; Taipale et al., 2008), or by collecting air samples and analyzing them with gas chromatography, which also separates the different monoterpenes from each other (Hakola et al., 2003). Monoterpene concentration in the boreal forest has been observed to be lowest in winter (below 0.1 ppbv) and highest in summer (>0.25 ppbv) (Hakola et al., 2009; Lappalainen et al., 2009). The summertime maximum in the concentration results from the fact that the emissions of monoterpenes from the vegetation are highest in summer, as they are mainly controlled by temperature and linked to plant activity (Tarvainen et al., 2005; Hakola et al., 2006; Rantala et al., 2015). In some studies measured monoterpene concentrations have been used for estimating the concentration of their oxidation products (Lehtipalo et al., 2011). However, monoterpene concentration data are often available only for short measurement periods, and thus they are not always suitable for the analysis of long-term data sets. Furthermore, the recent development of chemical ionization mass spectrometry techniques has enabled direct measurements of monoterpene oxidation products, but these data are still very scarce (Ehn et al., 2014; Jokinen et al., 2015).
Due to the limited amount of data on the concentrations of monoterpenes and their oxidation products, in some studies they have been estimated by using simple proxies. One proxy for monoterpene concentration was obtained by parametrizing measured monoterpene concentration as a function of air temperature (Lappalainen et al., 2009). This proxy has been utilized for calculating the oxidation products of monoterpenes from reactions with hydroxyl radical (OH) and ozone (O3). However, this earlier approach has several limitations: 1) only daytime values of measured monoterpene concentration were used for the parametrization; 2) the mixing within the boundary layer, which dilutes the monoterpene concentration, was not considered; 3) the oxidation of monoterpenes by nitrate radical (NO3), a major loss mechanism of monoterpenes at night (Peräkylä et al., 2014; Mogensen et al., 2015), was not included. Therefore, this proxy is not able to describe the diurnal variation of the concentrations of monoterpenes and their oxidation products.
In this study, we construct improved proxies for the concentration of monoterpenes and their oxidation products at a boreal forest site in Hyytiälä, southern Finland. Our proxies for monoterpene concentration include biological, physical, meteorological, and chemical processes: the temperature-driven emissions of monoterpenes, the dilution of the concentration caused by the mixing within the boundary layer, and the oxidation of monoterpenes by O3, OH and NO3. For deriving these proxies we use monoterpene concentration measured in Hyytiälä during 2006-2013. To assess the performance of the novel proxies, we compare different versions of the proxy to the measured monoterpene concentration, and investigate how well the proxies are able to describe the observed seasonal and diurnal variation of monoterpene concentration. Finally, we use one of the monoterpene proxies to calculate the concentration of the oxidation products of monoterpenes in Hyytiälä during 1996-2014 and investigate its seasonal and diurnal cycle.

Measurements
The measurements were performed during 2006-2013 at the SMEAR II station in Hyytiälä, southern Finland (Hari and Kulmala, 2005). The station is located in the southern boreal vegetation zone, and it is surrounded by a rather homogeneous Scots pine (Pinus sylvestris) forest (Ilvesniemi et al., 2009; Bäck et al., 2012). For constructing the proxy for monoterpene concentration, we used the volume mixing ratios of monoterpenes measured with a proton transfer reaction quadrupole mass spectrometer (PTR-MS; Ionicon Analytik GmbH, Austria) (Taipale et al., 2008).
The PTR-MS was maintained at a drift tube pressure of 1.95-2.20 mbar. The primary ion signal (H3O+) varied between 1 and 30×10⁶ cps, being typically around 10×10⁶ cps. With these settings, the E/N ratio, where E is the electric field and N the number density of the gas in the drift tube, varied between 105 and 125 Td (Td = 10⁻²¹ V m²). The instrumental background was determined every second or third hour with a zero-air generator (Parker ChromGas Zero Air Generator, model 3501, USA), and the instrument was calibrated every 2-4 weeks using an alpha-pinene standard gas (Apel-Riemer Environmental Inc., USA, or Ionimed GmbH, Austria) which was diluted to around 1-5 ppbv. The monoterpene concentrations were derived from the measured m/z (mass-to-charge ratio) 137 signal according to Taipale et al. (2008). Briefly, the measured signal was first normalized using the measured H3O+ and H2OH3O+ signals, and the drift tube temperature and pressure. Then, the normalized signal was converted to the volume mixing ratio using a normalized instrumental sensitivity. The measurements were conducted close to the forest canopy at the height of 14 m (until 2009) or 16.8 m (from 2010 onwards) (Taipale et al., 2008; Rantala et al., 2015). Until March 2007 the measurements were performed every second hour and after that every third hour. The measurements were not conducted continuously during the years 2006-2013 but, especially in the beginning, only during intensive measurement campaigns. The number of data points (1-hour averages) obtained for each month is presented in Table 1, which shows that there are more data available in summer and spring months than in autumn and winter. To reduce the effect of anthropogenic pollution episodes on monoterpene concentration, the data from time periods when the wind direction corresponded to the direction of the nearby sawmill were omitted from the analysis (Liao et al., 2011).
For calculating the proxy, the concentrations of ozone (O3) and nitrogen oxides (NO and NOx) were utilized. O3 concentration was recorded with an ozone analyzer (TEI 49C, Thermo Fisher Scientific, Waltham, MA, USA) based on the absorption of UV radiation. NO and NOx concentrations were measured with a chemiluminescence analyzer (TEI 42C TL, Thermo Fisher Scientific, Waltham, MA, USA). NO2 concentration was calculated by subtracting NO concentration from NOx concentration.
The 30-minute averages of O3, NO, and NO2 concentrations measured at the height of 16.8 m were used in the analysis.
In addition, the 30-minute averages of UV-B radiation, temperature and wind speed were used in the calculations. UV-B radiation was measured with a SL 501A pyranometer (Solar Light, Philadelphia, PA, USA) at the height of 18 m. Temperature was measured with a PT-100 sensor at 16.8 m height. Wind speed was measured at the height of 16.8 m with a cup anemometer (A101M/L, Vector Instruments, Rhyl, Clwyd, UK) until September 2003 and with an ultrasonic anemometer (Ultrasonic anemometer 2D, Adolf Thies GmbH, Göttingen, Germany) after that. Furthermore, ECMWF (European Centre for Medium-Range Weather Forecasts) reanalysis data were used for determining the boundary layer height (BLH) in Hyytiälä.
Finally, when calculating the oxidation products of monoterpenes, the condensation sink (CS), describing the loss rate of vapors due to condensation on aerosol particles, was calculated from the particle size distribution data. Particle size distributions were measured with a Differential Mobility Particle Sizer (DMPS; Aalto et al., 2001) at ground level.

Proxy for monoterpenes
The concentration of monoterpenes in the boundary layer is determined by various physical, chemical, meteorological and biological processes. In Hyytiälä the main source of monoterpenes is the emissions from the forest ecosystem, which are largely controlled by air temperature (Guenther et al., 1993). The most important sink of monoterpenes is their oxidation by O3, OH and NO3 radicals (Atkinson and Arey, 2003). In addition, the monoterpene concentration is strongly affected by dilution caused by the mixing of the boundary layer.
For monoterpene emissions we used a temperature (T)-dependent exponential function, which has been observed to describe the monoterpene emissions well in Hyytiälä (Tarvainen et al., 2005; Lappalainen et al., 2009):
$$E = a \exp\left(b\,(T - T_0)\right) \qquad (1)$$
Here $a$ and $b$ are empirical parameters and $T_0$ is 303.15 K.
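The temperature response described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exponential form a·exp(b·(T − T0)) is the Guenther-type expression implied by the text, the function name `emission_term` is hypothetical, and a = 1.0 is a placeholder rather than a fitted value. Only T0 = 303.15 K and the literature starting value b = 0.09 K⁻¹ are taken from the text.

```python
import math

T0_KELVIN = 303.15  # reference temperature given in the text


def emission_term(temp_k, a, b):
    """Exponential temperature response a * exp(b * (T - T0)) (assumed form)."""
    return a * math.exp(b * (temp_k - T0_KELVIN))


if __name__ == "__main__":
    # a = 1.0 is a placeholder; b = 0.09 K^-1 is the literature starting value.
    warm = emission_term(293.15, a=1.0, b=0.09)
    cold = emission_term(283.15, a=1.0, b=0.09)
    print(f"emission ratio for a 10 K warming: {warm / cold:.2f}")
```

With b = 0.09 K⁻¹, a 10 K warming increases the emission term by a factor of exp(0.9), roughly 2.5, independently of the value of a.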
The sink of monoterpenes due to oxidation by O3, OH and NO3 can be calculated from
$$\mathrm{sink} = \left(k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]\right)[\mathrm{MT}] \qquad (2)$$
Here $k_{\mathrm{OH+MT}}$, $k_{\mathrm{O_3+MT}}$ and $k_{\mathrm{NO_3+MT}}$ are reaction rate coefficients between monoterpenes and different oxidants, and [MT] is the monoterpene concentration. To obtain the correct diurnal cycle for the reaction rate coefficients, we used temperature-dependent relations for alpha-pinene (Atkinson et al., 2006; see Table A1 in Appendix). Alpha-pinene is the most abundant monoterpene in Hyytiälä during summer but also delta-3-carene, camphene, limonene and beta-pinene contribute significantly to the total monoterpene concentration (Hakola et al., 2009; Bäck et al., 2012; Hakola et al., 2012). In winter, camphene has, on average, the highest concentration, followed by alpha-pinene. To take into account the seasonal variation of reaction rate coefficients caused by the changes in the composition of monoterpenes, we utilized the monthly-mean reaction rate coefficients presented by Peräkylä et al. (2014).
Measuring NO3 concentration is challenging and has been conducted in Hyytiälä only for a short time period during which NO3 mixing ratios were mostly below the detection limit of the instrument (Williams et al., 2011; Mogensen et al., 2015). Therefore, we estimated the concentration of NO3 in a similar way as was done by Peräkylä et al. (2014). A steady state between the production of NO3 in the reaction between O3 and NO2 and the removal of NO3 was assumed:
$$[\mathrm{NO_3}] = k_{\mathrm{O_3+NO_2}}[\mathrm{O_3}][\mathrm{NO_2}]\,\tau_{\mathrm{NO_3}} \qquad (4)$$
Here $k_{\mathrm{O_3+NO_2}}$ is the temperature-dependent reaction rate coefficient between NO2 and O3, which was calculated from a temperature-dependent relation (Atkinson et al., 2004; see Table A1). $\tau_{\mathrm{NO_3}}$ is the lifetime of NO3.
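As a minimal sketch of the steady-state assumption for NO3, the concentration can be written as the production rate from O3 + NO2 multiplied by the NO3 lifetime. The function name `no3_steady_state` and all numerical inputs below are illustrative assumptions, not values from the study; the rate coefficient is passed in as an argument because the relation in Table A1 is not reproduced here.

```python
def no3_steady_state(k_o3_no2, o3, no2, tau_no3):
    """Steady-state NO3 concentration (cm^-3): production rate times lifetime.

    k_o3_no2 : rate coefficient of O3 + NO2 (cm^3 s^-1)
    o3, no2  : concentrations (cm^-3)
    tau_no3  : NO3 lifetime (s)
    """
    production = k_o3_no2 * o3 * no2  # cm^-3 s^-1
    return production * tau_no3


if __name__ == "__main__":
    # Illustrative inputs only (not values from the study).
    print(no3_steady_state(k_o3_no2=3.5e-17, o3=7.5e11, no2=2.5e10, tau_no3=5.0))
```

Because the concentration scales linearly with the lifetime, any error in the estimated lifetime carries over directly to the NO3 concentration.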
During daytime NO3 is efficiently removed by photolysis, and thus we assumed for it a lifetime of 5 s for all times when UV-B radiation was higher than 0.01 W m⁻² (Peräkylä et al., 2014). The lifetime during nighttime was calculated from
$$\tau_{\mathrm{NO_3}} = \frac{1}{k_{\mathrm{NO_3+MT}}[\mathrm{MT}] + k_{\mathrm{NO_3+NO}}[\mathrm{NO}] + K\,k_{\mathrm{N_2O_5+H_2O}}[\mathrm{NO_2}][\mathrm{H_2O}]} \qquad (5)$$
Here $k_{\mathrm{NO_3+MT}}$ is the reaction rate coefficient between monoterpenes and NO3, and $k_{\mathrm{NO_3+NO}}$ between NO3 and NO, for which temperature-dependent relations were used (Atkinson et al., 2004, 2006; see Table A1). $K$ is the equilibrium constant for the reaction between NO3 and NO2 producing N2O5, which was calculated from the relation K = 5.1×10⁻²⁷ exp(10871/T) (Osthoff et al., 2007). $k_{\mathrm{N_2O_5+H_2O}}$ is the reaction rate coefficient between N2O5 and water vapor, for which the value of 2.5×10⁻²² cm³ s⁻¹ was used (Atkinson et al., 2004). In reality, NO3 also reacts with other VOCs, such as isoprene, but in a pine forest in Hyytiälä their contribution to the lifetime is only minor compared to the reactions with monoterpenes (Peräkylä et al., 2014).
Furthermore, when calculating the lifetime of NO3, Peräkylä et al. (2014) also considered the heterogeneous uptake of N2O5 by aerosol particle surfaces but we omitted that process from our calculations due to its minor effect on the lifetime according to their study.
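The lifetime calculation can be illustrated with the following sketch. The daytime rule (5 s when UV-B exceeds 0.01 W m⁻²), the equilibrium constant relation and the N2O5 + H2O rate coefficient are taken from the text. The nighttime expression, one over the sum of the losses to monoterpenes, NO and N2O5 hydrolysis, is an assumed reading of the described processes, and the function names and argument layout are hypothetical.

```python
import math

DAYTIME_LIFETIME_S = 5.0   # assumed lifetime under photolysis (from the text)
UVB_THRESHOLD = 0.01       # W m^-2, daytime criterion (from the text)
K_N2O5_H2O = 2.5e-22       # cm^3 s^-1 (from the text)


def equilibrium_constant(temp_k):
    """Equilibrium constant K for NO3 + NO2 <-> N2O5 (cm^3)."""
    return 5.1e-27 * math.exp(10871.0 / temp_k)


def no3_lifetime(uvb, temp_k, mt, no, no2, h2o, k_no3_mt, k_no3_no):
    """NO3 lifetime in seconds; concentrations in cm^-3.

    The nighttime loss terms are an assumed form: reaction with
    monoterpenes, reaction with NO, and indirect loss through N2O5 + H2O.
    """
    if uvb > UVB_THRESHOLD:
        return DAYTIME_LIFETIME_S
    loss = (k_no3_mt * mt
            + k_no3_no * no
            + equilibrium_constant(temp_k) * K_N2O5_H2O * no2 * h2o)
    return 1.0 / loss
```

Heterogeneous uptake of N2O5 on particles is left out here as well, in line with the omission described above.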
By combining Eqs. (1) and (2), and including the effect of the mixing within the boundary layer by using the boundary layer height (BLH) and wind speed (ws), the equation for the ideal monoterpene proxy, including all the processes, can be written as:
$$\mathrm{MT_{proxy,ideal}} = \frac{a \exp\left(b\,(T - T_0)\right)}{k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]} \times f(\mathrm{BLH}) \times f(\mathrm{ws}) \qquad (6)$$
The values for the empirical parameters $a$ and $b$ and the functional forms of $f(\mathrm{BLH})$ and $f(\mathrm{ws})$ were determined as follows. First, an initial value for the parameter $b$, 0.09 K⁻¹, was obtained from the literature (Guenther et al., 1993). The BLH dependence was then inspected by plotting the ratio of the measured monoterpene concentration to the calculated steady-state concentration (the first term on the right-hand side of Eq. (6)) as a function of BLH. Different forms of dependence between the ratio and BLH were tested, and the power-law form $f(\mathrm{BLH}) = \mathrm{BLH}^c$ (for BLH values above 100 m) was found to describe the relation best (Fig. 1a). Next, the ratio of the measured monoterpene concentration to the product of the term $\mathrm{BLH}^c$ (the value for $c$ was fitted as in Fig. 1a; for BLH < 100 m the value of $100^c$ was used) and the steady-state concentration (the first term on the right-hand side of Eq. (6)) was plotted against wind speed (Fig. 1b). The power-law form was found to be most suitable also for this dependence, and an initial value for the parameter $d$ in $f(\mathrm{ws}) = \mathrm{ws}^d$ was extracted from the fitting. The effect of relative humidity (RH) was tested in a similar manner, but no dependence was found. Consequently, the equation for the ideal form of the monoterpene proxy becomes:
$$\mathrm{MT_{proxy,ideal}} = \frac{a \exp\left(b\,(T - T_0)\right)}{k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]} \times \mathrm{BLH}^{c} \times \mathrm{ws}^{d} \qquad (7)$$
After determining the initial values for the parameters $b$, $c$ and $d$, they were optimized by minimizing the variability of the data-point specific ratios between the proxy and the measurements, with the method presented by Paasonen et al. (2010). The variability was determined as the ratio between the 90th and 10th percentiles (V90/10) of the proxy-to-measurement ratios (see an illustration of the meaning of V90/10 in Fig. 2). 
The optimization was done with the MATLAB function fminsearch by searching for the values of b, c and d yielding the smallest variability, i.e. V90/10. The initial values for b, c and d in the fminsearch script were set to the values determined as described after Eq. (6). However, we also varied the initial values to check whether the obtained parameters represented only a local minimum in the variability. Finally, the value for the parameter a was determined as the geometric mean value of the ratio between the measured and proxy concentrations. The values obtained for the empirical parameters are presented in Table 2. Note that we used only data recorded during March-November for determining the parameters, thus excluding the data from winter months when biogenic emissions of monoterpenes are low. For finding the optimal parameters, we chose to minimize V90/10 instead of, for example, maximizing the correlation coefficient, because with the chosen method the proxy concentrations are optimized towards a one-to-one response to the measured concentrations. A higher value for the correlation coefficient between the proxy and measurements could be obtained with some other parameter values, but they might lead to a physically unsound non-linear dependence between the proxy and measurements.
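The two quantities used in the fitting, V90/10 and the scale parameter a, can be sketched in plain Python as follows. This is an illustration under assumptions: the percentile uses linear interpolation between order statistics, which may differ slightly from the percentile definition used in the original analysis, and the helper names are hypothetical. The simplex search itself (fminsearch implements the Nelder-Mead method) is not reproduced; `v90_10` would serve as its objective function.

```python
import math
import statistics


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def v90_10(proxy, measured):
    """Ratio of the 90th to the 10th percentile of proxy/measurement ratios."""
    ratios = [p / m for p, m in zip(proxy, measured)]
    return percentile(ratios, 90) / percentile(ratios, 10)


def scale_parameter(unscaled_proxy, measured):
    """Parameter a: geometric mean of measured / (unscaled proxy)."""
    ratios = [m / p for m, p in zip(measured, unscaled_proxy)]
    return statistics.geometric_mean(ratios)
```

V90/10 is unchanged when the proxy is multiplied by a constant, which is why b, c and d can be optimized first and a fixed afterwards from the geometric mean.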
The disadvantage of the ideal version of the proxy presented by Eq. (7) is that monoterpene concentration is needed for calculating NO3 concentration. Thus, this proxy is not useful in practice, as it cannot be used for estimating monoterpene concentration for times without monoterpene measurements. In order to overcome this problem, we modified the ideal proxy in terms of how NO3 concentration is dealt with. The simplest way is to neglect the oxidation of monoterpenes by NO3, in which case the proxy becomes
$$\mathrm{MT_{proxy1}} = \frac{a_1 \exp\left(b_1 (T - T_0)\right)}{k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}]} \times \mathrm{BLH}^{c_1} \times \mathrm{ws}^{d_1} \qquad (8)$$
Here $a_1$, $b_1$, $c_1$, and $d_1$ are empirical parameters, which were determined as explained above (see Table 2).
The oxidation of monoterpenes by NO3 presents a significant loss for monoterpenes during nighttime in Hyytiälä (Peräkylä et al., 2014; Mogensen et al., 2015) and therefore this mechanism should ideally be included in the monoterpene proxy. Thus, we tested a proxy in which we used a constant value of 4.3×10⁹ cm⁻³ for monoterpene concentration when calculating the lifetime of NO3 from Eq. (5). The constant value was obtained by calculating the median of monoterpene concentration measured at night. In this way, the second version of the proxy was obtained, now including the oxidation by NO3:
$$\mathrm{MT_{proxy2}} = \frac{a_2 \exp\left(b_2 (T - T_0)\right)}{k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]} \times \mathrm{BLH}^{c_2} \times \mathrm{ws}^{d_2} \qquad (9)$$
The values of the empirical parameters $a_2$, $b_2$, $c_2$, and $d_2$ are presented in Table 2.
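Finally, we applied an iterative method and calculated the NO3 lifetime by using the monoterpene proxy obtained from Eq. (9). This way, we obtained the third version of the monoterpene proxy:
$$\mathrm{MT_{proxy3}} = \frac{a_3 \exp\left(b_3 (T - T_0)\right)}{k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]} \times \mathrm{BLH}^{c_3} \times \mathrm{ws}^{d_3} \qquad (10)$$
The values of the empirical parameters $a_3$, $b_3$, $c_3$, and $d_3$ are shown in Table 2.
Additionally, we tested a simplified version of the proxy by including only the monoterpene emissions and the mixing of the boundary layer, and omitting the sink due to oxidation. In this case the proxy becomes:
$$\mathrm{MT_{proxy,simple}} = a_4 \exp\left(b_4 (T - T_0)\right) \times \mathrm{BLH}^{c_4} \times \mathrm{ws}^{d_4} \qquad (11)$$
The values of the empirical parameters $a_4$, $b_4$, $c_4$, and $d_4$ are presented in Table 2.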
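The different proxy versions share one structure: an emission term, optionally divided by the oxidation loss rate, multiplied by power laws of BLH and wind speed. A minimal Python sketch of that structure is given below. The function names are hypothetical, the exponential emission form is the assumed Guenther-type expression, and applying the 100 m floor on BLH in the final proxy (the text states it for the fitting step) is an assumption. Fitted parameter values from Table 2 are not reproduced.

```python
import math

T0_KELVIN = 303.15   # reference temperature given in the text
BLH_FLOOR_M = 100.0  # assumption: BLH below 100 m is treated as 100 m


def oxidation_loss_rate(k_oh, oh, k_o3, o3, k_no3=0.0, no3=0.0):
    """First-order loss rate of monoterpenes (s^-1); omit NO3 for proxy 1."""
    return k_oh * oh + k_o3 * o3 + k_no3 * no3


def mt_proxy(temp_k, blh_m, ws, a, b, c, d, loss_rate=None):
    """Monoterpene proxy; loss_rate=None gives the simplified version."""
    source = a * math.exp(b * (temp_k - T0_KELVIN))
    mixing = max(blh_m, BLH_FLOOR_M) ** c * ws ** d
    if loss_rate is None:
        return source * mixing
    return source / loss_rate * mixing
```

Passing a loss rate without the NO3 term corresponds to the version neglecting NO3 oxidation, while including it with NO3 computed from a constant or proxy-based monoterpene concentration corresponds to the second and third versions. Each version needs its own fitted a, b, c and d.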
It needs to be noted that because of the interrelations between the diurnal and annual cycles of temperature, BLH, wind speed and the concentrations of OH, O3 and NO3, the values obtained for the empirical parameters always depend on the other factors in the proxy. For example, the value optimized for the parameter d4 in the simplest version of the proxy differs significantly from the parameters d derived for the other proxies (see Table 2), presumably because of the unaccounted diurnal and/or seasonal cycles of the oxidant concentration in the simple proxy.
Furthermore, when studying the correlation between the proxies and measurements (see Sect. 3.1.1), we observed that MTproxy1, obtained from Eq. (8), often overestimates the monoterpene concentration in winter. This can be clearly seen when plotting the ratio between MTproxy1 and measured monoterpene concentration as a function of the day of the year (DOY) (Fig. 3). For other proxies this overestimation was not as clear. The overestimation of MTproxy1 in wintertime can be explained by the fact that MTproxy1 does not include the sink due to the oxidation of monoterpenes by NO3. On the other hand, it can also be related to the seasonal variation of the emission potential of vegetation, described by the coefficient a in our proxy (see also Tarvainen et al., 2005; Aalto et al., 2015). To improve the seasonal variation of MTproxy1, we fitted a DOY-dependent function to the ratio between MTproxy1 and measurements (the red line in Fig. 3). Then, the corrected proxy, MTproxy1,doy, was calculated from the fitted function (Eq. 12), in which the parameters h, l and m have values of 0.38, 0.57 and 0.13.

Proxy for monoterpene oxidation products
The concentration of oxidation products of monoterpenes was calculated based on their production in the reactions between monoterpenes and different oxidants and their removal by condensation on existing aerosol particles. The production rate can be calculated when knowing the concentrations of monoterpenes and different oxidants and the reaction rates between them.
The condensation sink (CS) can be calculated from the aerosol size distribution. Thus, by utilizing the proxies for monoterpene concentration derived in the previous section, the concentration of oxidation products of monoterpenes was obtained from
$$[\mathrm{OxOrg}] = \frac{\left(k_{\mathrm{OH+MT}}[\mathrm{OH}] + k_{\mathrm{O_3+MT}}[\mathrm{O_3}] + k_{\mathrm{NO_3+MT}}[\mathrm{NO_3}]\right)\mathrm{MT_{proxy}}}{\mathrm{CS}} \qquad (13)$$
Here $k_{\mathrm{OH+MT}}$, $k_{\mathrm{O_3+MT}}$ and $k_{\mathrm{NO_3+MT}}$ are reaction rate coefficients between monoterpenes and different oxidants, which were calculated as explained in the previous section, below Eq. (2). $\mathrm{MT_{proxy}}$ is the concentration of monoterpenes based on the selected monoterpene proxy. It should be noted that OxOrg can be thought to represent the total concentration of oxidized monoterpenes, because it takes into account all the generations of oxidation products, from the first oxidation step to condensable molecules. However, as the formulation of this proxy presumes that oxidation takes place relatively fast and that there are no other sinks than the condensation sink, it should be considered a rough estimate for the concentration of condensable organic vapors.

Results and discussion
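The balance between production and condensation loss described above can be sketched as follows. The steady-state form (production rate divided by CS) is an assumed reading of the text, and the function name `oxorg_steady_state` and its return layout are hypothetical. Splitting the result by oxidant mirrors the per-oxidant contributions discussed later in the paper.

```python
def oxorg_steady_state(mt, cs, k_oh, oh, k_o3, o3, k_no3, no3):
    """Return (total OxOrg, per-oxidant contributions), all in cm^-3.

    mt : monoterpene concentration from the selected proxy (cm^-3)
    cs : condensation sink (s^-1), the only loss term considered
    """
    if cs <= 0:
        raise ValueError("condensation sink must be positive")
    parts = {
        "OH": k_oh * oh * mt / cs,
        "O3": k_o3 * o3 * mt / cs,
        "NO3": k_no3 * no3 * mt / cs,
    }
    return sum(parts.values()), parts
```

Since CS appears in the denominator, periods with few pre-existing particles give high OxOrg for the same production rate.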

Monoterpene proxy
The time series of the measured monoterpene concentration and different monoterpene proxies for the whole year 2013 and one week in September 2013 are illustrated in Fig. 4. All the proxies can be observed to follow the measured monoterpene concentration well on an annual scale, capturing the build-up of the concentration in spring, the maximum in summer, and the decrease in the concentration in autumn. In addition, the proxies seem to describe the daily variation of the concentration adequately. In this section, the ability of different monoterpene proxies to reproduce the seasonal and diurnal variation of monoterpene concentration is discussed in more detail.

Correlation between proxies and measurements
Figure 5 shows the correlation between different monoterpene proxies and the measured monoterpene concentration using data from the years 2006-2013. All the proxies correlated well with the measurements. One of the highest correlation coefficients (R = 0.74, V90/10 = 6.6) was obtained for MTproxy,ideal, in which the measured monoterpene concentration was used for calculating NO3 concentration. This suggests that our equation for the proxy is plausible and considers the dynamics of the most important factors affecting the concentrations. On the other hand, among the true proxies, which do not use the monoterpene measurements, the best correlations were obtained for MTproxy,simple (R = 0.73, V90/10 = 5.8) and MTproxy1 (R = 0.70, V90/10 = 7.0). Furthermore, for MTproxy1,doy, a DOY-dependent version of MTproxy1, the correlation coefficient was even higher, and the variation of the ratio between the measured and proxy concentrations, described by the V90/10 value, lower (R = 0.74, V90/10 = 5.8).
MTproxy,simple does not include any oxidation losses of monoterpenes, and MTproxy1 and MTproxy1,doy include only the oxidation by OH and O3. The fact that the highest correlation coefficients were still obtained for these proxies indicates that estimating the oxidation losses, without using the measured monoterpene concentration when calculating NO3 concentration, introduces significant uncertainty into the proxy. Among the proxies including the oxidation by NO3, the correlation was stronger for MTproxy2 (R = 0.68, V90/10 = 6.9) than for MTproxy3 (R = 0.65, V90/10 = 9.2). In addition, for MTproxy3 the V90/10 value was clearly higher than for any other proxy. This further suggests that when the monoterpene proxy is used iteratively for calculating NO3 concentration, as is done in MTproxy3, the errors accumulate, making the final proxy uncertain (an overestimated monoterpene concentration leads to an underestimation of NO3 concentration, and thus of the oxidation sink, which further increases the calculated monoterpene concentration).
It needs to be noted that there are more measured monoterpene concentration data available in spring and summer than at other times of year (Table 1), and thus those data affect the correlation most when the whole data set is used. To investigate how well the proxies perform at different times of year, we studied the correlation between proxies and the measured monoterpene concentration in different seasons (Table 3). For all the proxies the correlation with the measurements was clear during spring, summer and autumn (R = 0.55-0.72), while in winter none of the proxies correlated with the measured monoterpene concentration (R between -0.11 and 0.16). The variations of the ratio between the proxy and measurements, i.e. the V90/10 values, were also clearly higher for winter than for other seasons. Generally, the correlation coefficients were higher for MTproxy,ideal, MTproxy1, MTproxy1,doy and MTproxy,simple than for MTproxy2 and MTproxy3, which include all the oxidation processes. Interestingly, for the proxies not including the oxidation by NO3, i.e. MTproxy1, MTproxy1,doy and MTproxy,simple, the highest correlation coefficients (even higher than for MTproxy,ideal) were obtained in spring. However, in autumn the correlation coefficient was clearly highest for MTproxy,ideal, which suggests that including the oxidation by NO3 in the proxy is essential at that time of year. The weak correlation between measurements and all the proxies in wintertime can be due to several reasons. First of all, in winter biogenic emissions of monoterpenes are low and the concentrations are more affected by anthropogenic emissions, which are not described by our proxies. At this time of year, measurement uncertainties are also high because the concentrations are often close to the detection limit of the PTR-MS (Taipale et al., 2008). On the other hand, in winter there are also more uncertainties related to the proxies. 
For example, the boundary layer height is often not well defined in winter (Von Engeln and Teixeira, 2013). In addition, the contribution of NO3 to the oxidation loss of monoterpenes can be expected to be higher in winter than in summer as there is less solar radiation (Peräkylä et al., 2014).

Monthly median concentrations
The monthly median concentrations of the measured monoterpenes and different proxies are shown in Fig. 6a. The measured monoterpene concentration was highest in July (median value 9.4×10⁹ cm⁻³) and lowest in February-March (median value 8.2×10⁸ cm⁻³). The summer maximum and the winter minimum were captured by all the proxies. However, the ratios between the concentrations predicted by the proxies and the measured monoterpene concentrations varied from month to month (Fig. 6b). In November-January, all the proxies overestimated the monoterpene concentration. MTproxy,simple was closest to the measurements, while MTproxy1 overestimated the concentration most, showing the need for the DOY-dependent correction (see Sect. 2.2.1). In February, though, MTproxy1 was close to the measurements together with MTproxy1,doy and MTproxy,simple, while other proxies predicted too low concentrations. In March-May all the proxies performed adequately, apart from MTproxy,simple overestimating the concentration in March. In midsummer, June-July, the proxy-to-measurement ratios were close to one for all the proxies. In August all the proxies slightly overestimated the concentration. In September-October, the proxies were generally reasonably close to the measurements; MTproxy,simple and MTproxy1,doy underestimated the monoterpene concentration and other proxies slightly overestimated it. Altogether, the median proxy-to-measurement ratios were between 0.8 and 1.3 in April-October and between 0.6 and 2.6 in November-March. More detailed statistics of the ratio between the proxies and measured monoterpene concentration in different months are presented in Table A2 in Appendix.
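A month-by-month comparison of this kind can be sketched with the standard library alone. The snippet below assumes that the monthly statistic is the median of the data-point specific proxy-to-measurement ratios, which is one plausible reading of the text; the record layout `(month, proxy, measured)` and the function name are hypothetical.

```python
from collections import defaultdict
import statistics


def monthly_median_ratio(records):
    """Median proxy-to-measurement ratio per calendar month.

    records : iterable of (month, proxy, measured) tuples, month in 1..12.
    Points with non-positive measured values are skipped.
    """
    by_month = defaultdict(list)
    for month, proxy, measured in records:
        if measured > 0:
            by_month[month].append(proxy / measured)
    return {m: statistics.median(r) for m, r in sorted(by_month.items())}
```

A value above one for a month indicates that the proxy tends to overestimate the measured concentration in that month.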
All in all, it seems that the proxies generally predict too high concentrations in winter but are able to reproduce the correct concentration level relatively well in other seasons. In winter, MTproxy,simple tends to be closest to the measurements, while at other times of the year MTproxy,ideal, MTproxy1,doy, MTproxy2, and MTproxy3 performed best. The overestimation by most of the proxies during wintertime may be related to the fact that, in reality, the emission potential of vegetation (described by the coefficient a in our proxies) has a strong seasonal variation (Taipale et al., 2011; Rantala et al., 2015). For MTproxy1 the DOY-dependent correction, which can be thought to represent the seasonal variation of the emission potential, improves the seasonal cycle of the proxy, as MTproxy1,doy does not overestimate the wintertime concentrations as much as MTproxy1. The month-to-month variation of the proxy-to-measurement ratios may reflect the uneven distribution of measured data: for some months, there are measurements available only from a few years, in which case the variation due to the specific conditions of those years strongly affects the proxy-to-measurement ratio (see Table 1).

Diurnal cycle
In addition to reproducing the correct concentration level at different times of year, it is essential that the proxies are able to describe the diurnal variation of monoterpene concentration. The median diurnal cycles of measured monoterpene concentrations and three proxies are illustrated in Fig. 7 for six different months (the rest of the months are shown in Fig. A1 in Appendix).
In March-September, the measured monoterpene concentration had a clear diurnal cycle with the lowest concentrations around noon and the highest concentrations at night or late in the evening. In March, MTproxy1 and MTproxy1,doy captured the diurnal cycle of monoterpene concentration best. MTproxy,simple overestimated the concentration throughout the day, and MTproxy,ideal, MTproxy2 and MTproxy3 predicted too strong a diurnal variation and too high daytime concentrations. In April and May all the proxies were able to reproduce the diurnal cycle quite well, having the daily maxima and minima around the same time as the measured concentration. In June-August, the measured monoterpene concentration had a very strong diurnal cycle with a minimum around noon. In these months, MTproxy,simple produced a clearly too weak diurnal cycle, while other proxies described the diurnal cycle of monoterpene concentration well. In September, the proxies performed adequately in general, except for MTproxy,simple, which had too weak a diurnal cycle, and MTproxy1,doy, which predicted too low concentrations. In October-February, the measured monoterpene concentration had a significantly weaker diurnal cycle than in summer, and the highest concentrations were generally reached during daytime. In these months MTproxy,ideal, MTproxy2 and MTproxy3 produced too strong a diurnal variation and clearly overestimated the concentration during daytime. MTproxy1 also overestimated the concentration, while the concentrations predicted by MTproxy,simple and MTproxy1,doy were closest to the measurements.
Altogether, it seems that the proxies including all oxidation mechanisms (i.e. MTproxy,ideal, MTproxy2 and MTproxy3) were able to describe the diurnal variation of the monoterpene concentration well in summer, when the diurnal cycle of the concentration was strong. The simpler proxies, especially MTproxy,simple, were not able to capture the diurnal cycle as well at this time of year. On the other hand, in winter months, when the diurnal cycle was weaker, the simpler proxies reproduced the diurnal cycle best. The fact that the proxies were not able to reproduce the diurnal cycle accurately in winter is understandable, as at that time of year the biogenic emissions of monoterpenes are low (see also the discussion at the end of Sect. 3.1.1). Furthermore, in winter the relative role of NO3 becomes higher (Peräkylä et al., 2014), as there is less solar radiation, and therefore the uncertainties related to calculating its concentration affect the proxies more than in summer. The boundary layer height, used in the proxies to describe the dilution of the monoterpene concentration, is also not as well defined in winter as in summer (Von Engeln and Teixeira, 2013).
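The median diurnal cycles discussed in this section can be obtained by binning a time series by calendar month and hour of day and taking the median of each bin. The following Python sketch shows one way to do this; the function name and the input layout (pairs of timestamp and concentration) are assumptions made for illustration, and the hourly bin width is a choice of this sketch rather than a detail stated in the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_diurnal_cycle(samples):
    """Median concentration for each (month, hour-of-day) bin.

    samples -- iterable of (timestamp, concentration) pairs (hypothetical layout)
    Returns {month: {hour: median concentration}}.
    """
    bins = defaultdict(list)
    for timestamp, value in samples:
        if value is None:
            continue  # skip gaps in the time series
        bins[(timestamp.month, timestamp.hour)].append(value)

    cycles = defaultdict(dict)
    for (month, hour), values in sorted(bins.items()):
        cycles[month][hour] = median(values)
    return dict(cycles)


# Usage with made-up values (not measured data):
samples = [
    (datetime(2010, 7, 1, 12, 0), 1.0),
    (datetime(2011, 7, 2, 12, 30), 3.0),
    (datetime(2012, 7, 3, 12, 15), 2.0),
    (datetime(2010, 7, 1, 0, 0), 5.0),
]
print(median_diurnal_cycle(samples))
```

Applying the same function to the measured concentration and to each proxy gives curves that can be compared month by month, as in Fig. 7 and Fig. A1.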

Monoterpene oxidation products
The concentration of monoterpene oxidation products (OxOrg) in Hyytiälä was calculated for the years 1996-2014 (Fig. 8). MTproxy1,doy was used for the monoterpene concentration in the calculation, as it was observed to reproduce both the seasonal and diurnal cycle of the monoterpene concentration reasonably well. In this section the seasonal and diurnal variations of the calculated monoterpene oxidation products are discussed.
Figure 9 presents the monthly medians of the total concentration of monoterpene oxidation products and the contributions of different oxidants (O3, OH and NO3) to the total concentration during 1996-2014. The total concentration of oxidation products had a distinct seasonal cycle: the median concentrations were highest, 1.9-2.4 × 10^8 cm^-3, in summer (June-August) and lowest, 3.4-5.4 × 10^7 cm^-3, in winter and early spring (January-March). Thus, the seasonal cycle of the oxidation products resembled the seasonal cycle of MTproxy1,doy (see Fig. 6a). The summertime peak in the total concentration of oxidation products was caused by the oxidation products of O3, which had a pronounced maximum in July and a minimum in February. The concentration of the oxidation products of NO3, on the other hand, was lowest in spring (February-May) and highest in autumn and winter (October-January). In October-March the median concentrations of oxidation products of NO3 were even higher than the median concentrations of oxidation products of O3. The oxidation products of OH had a clear seasonal cycle with a maximum in July and a minimum in winter, following the seasonal cycle of solar radiation. In summer months the median concentrations of oxidation products of OH were similar to the median concentrations of oxidation products of NO3, both of them being clearly lower than the median concentrations of oxidation products of O3.
Thus, our proxy for the oxidation products of monoterpenes seems to be dominated by the oxidation of monoterpenes by O3 in summer, while in winter the oxidation by NO3 is most significant.
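The split of the oxidation products into contributions from different oxidants can be sketched as follows. The sketch assumes a steady state between the production of oxidation products in the reactions of monoterpenes with O3, OH and NO3 and their loss by condensation onto aerosol particles, described here by a condensation sink (CS). The steady-state form, the function name, the parameter names and all numerical values are assumptions of this sketch; the rate coefficients and concentrations in the usage example are placeholders and are not taken from this study.

```python
def oxorg_steady_state(mt, oxidants, rate_coeffs, cs):
    """Per-oxidant and total steady-state oxidation product concentrations (cm^-3).

    Assumes production k_i * [oxidant_i] * [MT] balanced by the loss CS * [OxOrg_i].

    mt          -- monoterpene concentration (cm^-3)
    oxidants    -- dict of oxidant concentrations (cm^-3), e.g. keys "O3", "OH", "NO3"
    rate_coeffs -- dict of reaction rate coefficients (cm^3 s^-1), same keys
    cs          -- condensation sink (s^-1), must be positive
    """
    if cs <= 0:
        raise ValueError("condensation sink must be positive")
    contributions = {
        name: rate_coeffs[name] * conc * mt / cs
        for name, conc in oxidants.items()
    }
    contributions["total"] = sum(contributions.values())
    return contributions


# Usage with placeholder values (illustrative only, not from the paper):
result = oxorg_steady_state(
    mt=5e9,
    oxidants={"O3": 7e11, "OH": 1e6, "NO3": 1e7},
    rate_coeffs={"O3": 1e-16, "OH": 5e-11, "NO3": 6e-12},
    cs=5e-3,
)
for name, value in result.items():
    print(f"{name}: {value:.2e} cm^-3")
```

In such a formulation, the dominant oxidant in a given season is determined by the product of its rate coefficient and its concentration, which is consistent with O3 dominating in summer and NO3 in winter.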

Diurnal variation
In Fig. 10 the diurnal cycle of the concentration of monoterpene oxidation products (the total and the contributions of different oxidants) is illustrated for different seasons. In all seasons the total concentration of oxidation products was highest in the evening and lowest in the morning or around noon. The diurnal cycle was mostly determined by the diurnal variation in the oxidation products of NO3 in all seasons except summer (June-August). In March-May the total concentration of oxidation products stayed quite stable during daytime (from 6:00 to 18:00) and was at that time dominated by the oxidation products of O3. The concentration had a pronounced peak in the evening around 21:00, which was caused by the maximum in the concentration of the oxidation products of NO3. In June-August the total concentration of oxidation products was lowest in the morning around 5:00, after which the concentration increased, reaching its maximum around 21:00. The evening peak was mainly due to the maximum in the oxidation products of O3, which dominated the total concentration throughout the day. On the other hand, at this time of year, the contribution of OH was also significant during daytime. In September-November the evening peak in the total concentration of oxidation products occurred earlier, around 18:00. It was primarily caused by the maximum in the concentration of the oxidation products of NO3. During daytime the total concentration of oxidation products was dominated by the oxidation products of O3. In December-February the total concentration of oxidation products followed the oxidation products of NO3; the concentration was lowest during daytime and highest at night. In all seasons except winter, the oxidation products of OH had a pronounced maximum around noon. At that time the concentration of the oxidation products of OH generally exceeded the concentration of the oxidation products of NO3, while still being lower than the concentration of the oxidation products of O3. In winter, when there is only little solar radiation, the concentration of oxidation products of OH was very low throughout the day.

Conclusions
The oxidation products of monoterpenes likely have an important role in the formation and growth of aerosol particles in boreal forests (Kulmala et al., 1998; O'Dowd et al., 2002; Ehn et al., 2014; Jokinen et al., 2015). Therefore, an improved understanding of their concentration is needed, for example, when determining the climate effects of aerosol particles. In this study, we developed proxies for estimating the concentrations of monoterpenes and their oxidation products at a boreal forest site in Hyytiälä, southern Finland. For deriving the proxies and testing their validity, we used the monoterpene concentration measured in Hyytiälä during 2006-2013.
Our proxies for the monoterpene concentration include the temperature-driven emissions of monoterpenes, the dilution of the concentration caused by the mixing within the boundary layer, and the oxidation of monoterpenes by different oxidants (OH, O3, and NO3). Due to the difficulties related to estimating the concentration of NO3, we tested five different versions of the proxy: 1) a proxy where the oxidation of monoterpenes by NO3 is neglected, 2) a proxy where the oxidation of monoterpenes by NO3 is neglected and an additional DOY-dependent correction is applied, 3) a proxy where NO3 concentration is estimated by using a constant value for monoterpene concentration, 4) a proxy where NO3 concentration is calculated iteratively by using another monoterpene proxy, and 5) a proxy where all the oxidation processes are neglected.
All versions of the proxies for the monoterpene concentration correlated well with the measured concentration (R = 0.65-0.74), and thus captured the seasonal variation of the monoterpene concentration. The best correlation with the measurements was obtained for the proxies not including the oxidation by NO3. This suggests that estimating the NO3 concentration causes too much uncertainty to improve the performance of the proxies, thus demonstrating the need for direct measurements of the NO3 concentration with a sufficiently low detection limit. When the ratios between the proxies and the measured monoterpene concentration were investigated, the proxies were mostly found to predict the correct concentration level in summer but to overestimate the concentration in winter. The typical proxy-to-measurements ratios were between 0.8 and 1.3 in summer and between 0.6 and 2.6 in winter. In addition, the proxies were observed to describe the diurnal variation of the monoterpene concentration reasonably well in summer but rather poorly in winter. Generally, the proxies including all the oxidation processes were able to reproduce the diurnal cycle of the monoterpene concentration in summer months, when the measured concentration had a strong diurnal variation. However, in winter, when the diurnal cycle of the measured concentration was weak, the simpler proxies were closer to the measurements. Altogether, the proxy neglecting the oxidation of monoterpenes by NO3 and including a DOY-dependent correction (MTproxy1,doy, see Eq. (12)) was found to describe the variation of the monoterpene concentration most accurately. Therefore, we recommend using this proxy for predicting the monoterpene concentration in Hyytiälä and at similar, remote boreal forest sites.
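For reference, a correlation coefficient such as the R values quoted above can be computed from paired proxy and measurement values as in the following Python sketch. The sketch uses the Pearson correlation on the values as given; whether the correlations reported here were calculated on linear or log-transformed concentrations is not stated in this section, so that choice, like the function name, is an assumption of the sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values, ignoring pairs with gaps."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Usage with made-up values (not measured data):
measured = [1.0, 2.0, None, 4.0, 5.0]
proxy = [1.5, 1.8, 3.0, 4.4, 4.9]
print(pearson_r(measured, proxy))
```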
To investigate the diurnal and seasonal variation of the oxidation products of monoterpenes in Hyytiälä, we calculated their concentration during 1996-2014 by using the most accurate monoterpene proxy. The oxidation products of monoterpenes had a clear seasonal cycle with the highest concentration in summer and the lowest concentration in winter. When studying the diurnal variation of the oxidation products, the concentration was found to be highest in the evening and lowest in the morning or around noon. The evening maximum was mainly caused by the oxidation products of O3 in summer, and by the oxidation products of NO3 in other seasons. The contribution of the oxidation products of OH to the total concentration of oxidation products was highest in summer during daytime and minor in winter.
In the future, our proxies for the concentrations of monoterpenes and their oxidation products can be utilized, for example, when investigating the formation and growth of aerosol particles in Hyytiälä. The proxies could possibly be applied also at other measurement sites, at least those located in a boreal forest, but this remains to be tested in future studies. In addition, further work is needed to validate the performance of the proxy for the monoterpene oxidation products by using the direct measurements of oxidized organic compounds.

References
Rinne, J., Bäck, J., and Hakola, H.: Biogenic volatile organic compound emissions from the Eurasian taiga: current knowledge and future directions, Boreal Environ. Res., 14, 807-826, 2009.
Taipale, R., Ruuskanen, T. M., Rinne, J., Kajos, M. K., Hakola, H., Pohja, T., and Kulmala, M.: Technical Note: Quantitative long-term measurements of VOC concentrations by PTR-MS - measurement, calibration, and volume mixing ratio calculation methods, Atmos. Chem. Phys., 8, 6681-6698, doi:10.5194/acp-8-6681-2008, 2008.
Taipale, R., Kajos, M. K., Patokoski, J., Rantala, P., Ruuskanen, T. M., and Rinne, J.: Role of de novo biosynthesis in ecosystem scale monoterpene emissions from a boreal Scots pine forest, Biogeosciences, 8, 2247-2255, doi:10.5194/bg-8-2247
Temperature and light dependence of the VOC emissions of Scots pine, Atmos. Chem. Phys., 5, 989-998, doi:10.5194/acp-5-989-2005, 2005.

Table A2. The statistics of the ratio between the proxies and the measured monoterpene concentration in different months. The 10th, 25th, 50th, 75th, and 90th percentiles of the ratio are shown.

Figure A1. Median diurnal variation of the measured monoterpenes and different proxies in different months (other months are shown in Fig. 7). The black circles show the measured concentration, the grey squares MTproxy,ideal, the light blue squares MTproxy1, the dark blue squares MTproxy1,doy, the red squares MTproxy2, the green squares MTproxy3, and the magenta squares MTproxy,simple.