Local short-term variability in solar irradiance

Characterizing spatiotemporal irradiance variability is important for the successful grid integration of increasing numbers of photovoltaic (PV) power systems. Using 1 Hz data recorded by as many as 99 pyranometers during the HD(CP)2 Observational Prototype Experiment (HOPE), we analyze field variability of clear-sky index k (i.e., irradiance normalized to clear-sky conditions) and sub-minute k increments (i.e., changes over specified intervals of time) for distances between tens of meters and about 10 km. By means of a simple classification scheme based on k statistics, we identify overcast, clear, and mixed sky conditions, and demonstrate that the last of these is the most potentially problematic in terms of short-term PV power fluctuations. Under mixed conditions, the probability of relatively strong k increments of ±0.5 is approximately twice as high compared to increment statistics computed without conditioning by sky type. Additionally, spatial autocorrelation structures of k increment fields differ considerably between sky types. While the profiles for overcast and clear skies mostly resemble the predictions of a simple model published by Hoff and Perez (2012), this is not the case for mixed conditions. As a proxy for the smoothing effects of distributed PV, we finally show that spatial averaging mitigates variability in k less effectively than variability in k increments, for a spatial sensor density of 2 km−2.


Introduction
The number of photovoltaic (PV) power systems has drastically increased in many regions of the world during the last decade, reaching a total nominal capacity of more than 178 GW installed worldwide at the end of 2014.The future global PV capacity is expected to continually increase fur-ther, with predictions for 2019 ranging from 396 to 540 GW (SPE, 2015).In consequence, the challenges associated with the inherent volatility of PV power production and its fundamental cause, weather-induced heterogeneity in solar irradiance fields, will considerably increase as well (Stetz et al., 2015).Variability in both irradiance and irradiance increments (changes over specified intervals of time) are of interest in this context.On the one hand, variability in irradiance itself primarily affects the yield of a PV system and the dimensioning of battery storage.On the other hand, variability in irradiance increments impacts the balancing of generation and load, as well as the maintenance of power quality such as voltage and frequency stability.Depending on the dimensions of the power grid and the PV capacity in question, relevant variability in irradiance and its increments can span a broad range of spatiotemporal scales, from seconds and meters up to days and hundreds of kilometers.There is a need to understand the biases in representation of temporal variability resulting from temporally coarse-resolution observations (Yordanov et al., 2013b), as well as how spatial averaging (as would come from having distributed PV over a region) mitigates variability (Hoff and Perez, 2010).Characterizing the spatiotemporal volatility of irradiance fields and their increments is key to the planning and reliable operation of future power grids and their corresponding subsystems.
Recent studies of PV-related variability have analyzed power spectra of PV systems and solar irradiance (Calif et al., 2013;Curtright and Apt, 2008;Klima and Apt, 2015;Lave and Kleissl, 2010;Marcos et al., 2011a;Tabar et al., 2014;Yordanov et al., 2013b), compared power fluctuations from specific PV plants with corresponding irradiance measurements (Lave and Kleissl, 2013;Lave et al., 2013;Marcos et al., 2011a;van Haaren et al., 2014), and characterized power variability as a function of PV plant Published by Copernicus Publications on behalf of the European Geosciences Union.
size (Dyreson et al., 2014;Lave et al., 2012;Marcos et al., 2011b;van Haaren et al., 2014).Furthermore, spatial autocorrelation structures and decorrelation length scales of increments in irradiance and clear-sky index (i.e., irradiance normalized to clear-sky conditions), and PV power output have also been studied for a range of spatial scales and increment values (Arias-Castro et al., 2014;Elsinga and van Sark, 2014;Hinkelman, 2013;Hoff and Perez, 2012;Lave and Kleissl, 2013;Mills, 2010;Perez et al., 2012;Perpiñán et al., 2013).For all quantities and methods considered, increment correlations at different locations have been shown to decrease with increasing distance, with a smaller rate of decrease (longer decorrelation distances) for larger time increments.
While satellite-derived irradiance data are convenient for the analysis of large spatiotemporal scales, comprehensive data sets for local short-term variability are time consuming and expensive to collect.They are needed but have not previously been available (Hoff and Perez, 2012;Perez et al., 2012).Therefore, previous studies are either restricted to a limited spatial resolution, a limited temporal resolution, or both.For example, the few studies that have been based on high-resolution 1 Hz PV power observations only had measurements from a maximum of six PV systems (Marcos et al., 2011a, b) at their disposal.Though irradiance measurements with this high temporal resolution have on occasion been conducted with more sensors, the spatial coverage remains strongly confined (e.g., up to 45 sensors spread across ∼ 2.5 km 2 ; Dyreson et al., 2014).Some studies have used artificially generated data to overcome these restrictions, either by simulating simple cloud shapes (Arias-Castro et al., 2014;Lave and Kleissl, 2013), or by constructing virtual networks based on time-shifted single-sensor measurements (Perez et al., 2012).However, these simulated data sets do not necessarily coincide with reality.
To fill the gap in understanding of small-scale spatial and temporal variability in irradiance, we use an extensive experimental data set of global horizontal irradiance (GHI) field samples from two measurement campaigns to characterize sub-minute variability of clear-sky index for distances between tens of meters and about 10 km.A high temporal resolution of 1 Hz and the use of up to 99 synchronized silicon photodiode pyranometers yields a robust basis for the analyses (these data are described in detail in Sect.2).Based on this data set with its unprecedentedly fine resolution in both space and time, we study single-point statistics and twopoint correlation coefficients of clear-sky index, and develop a simple classification scheme to identify overcast, clear, and mixed skies (Sect.3).Conditioned on these three sky types, we then analyze the probability distributions of sub-minute increments in clear-sky index for single sensors and large spatial averages of about 80 km 2 , as well as their corresponding spatial autocorrelation structures (Sect.4).Finally, we spatially average randomly selected sensors from the data set, covering different area sizes but maintaining a fixed spatial density, as a proxy for the smoothing effects of distributed PV power production (Sect.5).Discussions and conclusions follow in Sects.6 and 7.

Measurement campaigns
The data sets on which this study's analyses are based originate from two extensive measurement campaigns performed during the HD(CP) 2 Observational Prototype Experiment (HOPE) using a set of autonomous silicon photodiode pyranometers.These instruments measure the downwelling shortwave radiation at the Earth's surface in the range between 0.3 and 1.1 µm.Although this wavelength band does not span the entire solar irradiance spectrum, it corresponds well with the relevant bandwidths of typical semiconductor materials used for photovoltaic applications (Pérez-López et al., 2007).In fact, the pyranometers themselves may essentially be thought of as tiny PV systems, reduced in space to a single point.Equipped with a battery power supply for up to 10 days, they store their data onsite with a temporal resolution of 10 Hz (averaged to 1 Hz during postprocessing).A GPS-based clocked control is used to ensure synchronization between sensors, and for proper positioning data.
The first field campaign with these instruments took place near Jülich, Germany (50.9 • N, 6.4 • E), from 2 April to 24 July 2013.It featured a total of 99 pyranometers deployed over an area of about 80 km 2 .The second was conducted near Melpitz, Germany (51.5 • N, 12.9 • E), and lasted from 3 September until 14 October 2013.During this time, 50 pyranometers, all of which had previously been used in the Jülich campaign, were deployed over an area of about 4 km 2 .During the measurement campaigns each instrument was subject to regular weekly maintenance.This maintenance included data transfer, battery replacement, thorough cleaning of the glass dome, and re-leveling of the mounting platform (if necessary).As part of this process, the states of cleanliness and orientation were recorded, in order to facilitate identification of periods of bad data.For each observation of tilt or fouling, that week's worth of data was flagged accordingly, even though the specific problem did not necessarily last for the entire preceding week.
The geometry of the pyranometer locations for the Melpitz and Jülich campaigns, as well as a histogram of all sensor pair distances d ij is presented in Fig. 1.The sensor layout of the Melpitz campaign, with many sensors concentrated in the center and fewer towards the edge of the domain, is more structured than that of the Jülich campaign.This difference is due to the much larger spatial domain in the Jülich case, which entailed external restrictions on the instrument locations, such as road access, setup permission, and agricultural land use.Consequently, most of the very short sensor pair distances (d ij < 1 km), and some intermediate distances (1 km < d ij < 2 km) are attributed to the Melpitz campaign, while sensor pair counts with d ij > 3 km are all associated with the Jülich campaign (Fig. 1c).
Taking into account the final data sets' high temporal resolution of 1 Hz, along with the corresponding dense spatial coverages, the two field campaigns provide the basis for unique analyses of irradiance variability, particularly regarding potential PV power fluctuations.At the same time, the limited durations of the campaigns result in a data set that extends from mid-spring to mid-autumn, and may not be representative of other times of the year.Schmidt et al. (2016) use data from the Jülich campaign for a performance evaluation of sky-imager-based solar irradiance forecasts, and Madhavan et al. (2016) present a more detailed discussion of the campaign and the instrumentation.To the best of the authors' knowledge, no other PV-related studies based on comparably dense and high-frequency irradiance sensor networks have been published to date.

Clear-sky index
The available global horizontal irradiance (GHI) at any given point on the Earth's surface is subject to influences from both astronomical and atmospheric processes.As for the former, the apparent movement of the Sun relative to Earth gives rise to diurnal and seasonal variations in GHI.These variations are accurately predictable and not large on short timescales of seconds or minutes.On the other hand, weather-related contributions to irradiance variability are manifold and complex, and present on all timescales.For instance, the growth, motion, and decay of clouds can affect the seasonal cycle in GHI (e.g., winter tends to be cloudier than summer in midlatitude low-lying land), and the rapid succession of sunlight exposure and cloud shadow in conditions dominated by fairweather cumulus generates stochastic variability on short timescales (seconds-minutes) (Woyte et al., 2007).The presence of different kinds of cloud (e.g., layer vs. convective) at different altitudes and of different composition (e.g., highalbedo small cloud droplets vs. low-albedo large droplets in rain clouds) result in a complex set of influences on GHI over a broad range of timescales.
In order to distinguish the cloud-induced fluctuations from the slowly evolving, astronomically determined apparent motion of the Sun, a GHI time series G at a particular location may be related to either the extraterrestrial solar radiation G extra (i.e., irradiance on Earth if there were no atmosphere), or the clear-sky radiation G clear (i.e., irradiance on Earth with a cloud-free atmosphere).Knowing these, we can define the clearness index and clear-sky index respectively.The extraterrestrial solar radiation G extra depends only on astronomical relationships, whereas a calculation of clear-sky radiation G clear requires parameters of atmospheric conditions, such as typical water vapor concentration and aerosol load.As the exact characteristics of G clear are model dependent, there is no single unique clear-sky index.The use of any clear-sky model thus introduces new uncertainties to the k * time series which are absent from the original irradiance measurements.In spite of these uncertainties, use of the clear-sky index is convenient because of the pronounced non-stationarity of the irradiance time series.While all atmospheric influences on GHI variability are included in k, variations in the clear-sky index are dominated by changes in cloud cover.Other sub-daily variations, especially those caused by changes in light scattering with solar angle α, are ideally removed entirely from k * (although depending on how accurately the atmospheric conditions and their variability are actually estimated, such changes with α may still affect the clear-sky index to a minor degree).With k * thus being better suited to remove trends in GHI variability, the focus of this analysis will be clear-sky irradiance time series computed for the respective locations of both field campaigns, using the clear-sky model described by Fontoynont et al. (1998).A limitation of this model is that it is based on climatological means and does not account for all variations in scattering or absorption properties.Moreover, relatively low values of GHI occurring shortly after sunrise and just before sunset, coupled with path prolongation and corresponding higher uncertainties in clear-sky ir-radiance calculations at these times, can result in unrealistic values of k * (Lave et al., 2012).In consequence, we only consider data associated with α > 15 • throughout the study.While the resulting clear-sky index remains an approximate model-based quantity, rather than a direct measurement, it allows us to focus directly on weather-related variations in surface irradiance.
The lowest values of k * are typically not zero, because even the darkest of clouds do not attenuate all irradiance.Additionally, the upper limit can exceed 1, primarily due to short-term reflections from the sides of clouds (and also to a secondary degree due to the limitations of the clear-sky model).Under broken cloud conditions this phenomenon, known as cloud enhancement, can cause single-point GHI to exceed its corresponding clear-sky irradiance value by more than 50 % on short timescales (Luoma et al., 2012;Piacentini et al., 2011;Yordanov et al., 2013a).
To characterize the modulation of k * variability by the prevailing sky type (e.g., overcast vs. clear sky), we divide the time series at each sensor into non-overlapping 15 min windows.This sub-hourly timescale is short enough that it is typically dominated by a single sky type, but long enough that there is enough variability to make statistical analyses meaningful.We will use differences in the statistics of k * within these 15 min windows to define different sky type categories.
To illustrate the wide range of cloud influences on k * statistics in these 15 min windows, Fig. 2 presents three distinct examples of spatiotemporal variability in k * .These representative subsets have been manually selected from a pool of random windows sampled from the entire duration of the Jülich campaign.Each panel includes summary statistics for all sensors in the domain for the period (represented as box plots), as well as the variability of a single randomly selected sensor.The box plots each consist of a lower "whisker" w low , the first quartile Q 1 , the median Q 2 , the third quartile Q 3 , and an upper "whisker" w up (summarized in Table 1).Following common practice when presenting box plots, w low (w up ) is defined as the lowest (highest) data point that still falls within the range of (Devore, 2015).Any data below w low or above w up are considered outliers.
The 15 min window in Fig. 2a features very little spatial variability and a continually low range of k * values, corresponding to a time of overcast conditions during which a fairly homogeneous cloud layer spanned the entire domain.In contrast, the majority of sensors in Fig. 2b show a continually high range of k * with little spatial variability, with the exception of some pronounced rapid and short-lived decreases in k * .Clear-sky conditions dominated the domain at this time, with occasional short-duration shadows cast on single sensors (although not the single example sensor).Finally, the data shown in Fig. 2c display considerable variability throughout the domain at all times, with a consistent IQR value of ∼ 0.5.The trace of the example sensor clearly  1).
illustrates the predominant condition of mixed skies in this case, with an alternation between cloud coverage and clearsky exposure.The characteristic differences in the temporal average and variability of k * evident in these example data sets indicate that a natural classification scheme for sky type within 15 min can be developed in terms of the statistics of k * .
3 Sky type variability

Single-point statistics of clear-sky index
In order to assess the character of irradiance variability conditioned on sky type, we group subsets of similar sky conditions by means of two simple statistics.Specifically, we compute the sample arithmetic mean Table 1.Summary statistics used to visualize data spread throughout this study.

Name Symbol Definition
First quartile and the sample standard deviation of k * i (the clear-sky index of the ith sensor) for each sensor for all 15 min periods, using non-overlapping windows of width T = 900 s (resulting in a sample number N = 900 for the 1 Hz data).
These two statistical measures allow an intuitive characterization of the prevailing sky type that a sensor has been subjected to for a limited time, by quantifying the average and spread of the respective 15 min window in its time series.A kernel density estimate (KDE) of the joint probability density function (PDF) of k * i and σ k * i using data for all individual sensors and all available days from both measurement campaigns is presented in Fig. 3.The estimated PDF is overlaid with a regular grid that we will use to define particular sky type classes.
In the low variability range σ k * i < 0.09, the two peaks in the joint PDF (located in A1/B1, and D1/E1 of Fig. 3, respectively) clearly represent the cases of predominant overcast (low k * i and low σ k * i ), and clear-sky (high k * i and low σ k * i ) conditions.In addition to these two well-defined end members, intermediate sky conditions were also recorded by the single sensors.While these span the entire ranges of k * i and σ k * i , they are not uniformly distributed.In fact, a separation between relatively clear-sky and overcast conditions (i.e., high and low k * i ) can be observed for most values of σ k * i , though the distinction becomes less pronounced with increasing σ k * i .

Two-point correlations of clear-sky index
The quantities k * i and σ k * i provide single-point temporal statistical information about variability in k * .As a first characterization of the spatial structure of the k * field, we compute the spatial two-point correlation coefficients between sensors i and j (5) conditioned on the 15 min intervals falling within individual boxes in Fig. 3.That is, all 15 min time windows, during which both sensors in a pair are simultaneously associated with the same sky type, are used to calculate a single ρ k * ij value for the respective pair of sensors.In Eq. ( 5), k * i (t) and k * j (t) are the two individual time series of the pair of sensors for those time periods in which the statistics of both sensors are in the same grid box, while k * i and k * j are the corresponding arithmetic means.The quantity N denotes the number of data points in the two time series.Note that the meaning of the overbar, denoting the arithmetic mean, is different than that in Eqs. ( 3) and (4), as the averaging time increases from 900 s in Eq. (3) to the total time of simultaneously available 1 Hz data of i and j in Eq. ( 5).
The resulting distributions of spatial autocorrelation functions ρ k * ij are shown as functions of sensor pair distance d ij in Fig. 4 for each of the grid boxes from Fig. 3.The same box plot statistics listed in Table 1 are computed for 10 logarithmically spaced bins of d ij .A pair of sensors is only included in these calculations if its members share the same sky type for at least 60 min over the observational period.The number of pairs used to derive the box plot information is given in each panel.
Pairs of sensors with very high σ k * i and very low k * i (panels A3 through A5), are found to be virtually non-existent, although these ranges are occupied by individual sensors (cf.Fig. 3).Similarly, combinations with relatively high σ k * i and moderate or large k * i (panels B4/5 and E4/5) also lack a high number of available sensor pairs.The remaining wellsampled grid boxes all show spatial autocorrelation functions ρ k * ij that decrease with increasing d ij , as expected.However, the rates of decrease vary considerably across the different grid boxes.The differences in autocorrelation structure between two adjacent grid boxes (e.g., A1 and B1) are generally small, but become more pronounced when comparing those farther apart (e.g., A1 and D4).
For further analyses, and consistent with the manually selected exemplary periods previously shown in Fig. 2, we consider a classification of three distinct sky types based on the grid boxes: 1. overcast (A1 and B1), 2. clear (D1 and E1), and

mixed (A3 through E5).
This classification is based upon the subjective identification of different grid boxes in Figs. 3 and 4 with characteristic statistical properties.Although some data corresponding to intermediate sky conditions (panels A2 through E2, as well as C1) are neglected using this classification scheme, the structure of ρ k * ij is appreciably distinct for all three identified sky types.
Under overcast conditions, correlation coefficients remain ρ k * ij ≈ 1 for d ij 1 km, while clear conditions deviate from ρ k * ij ≈ 1 even for relatively small separations d ij 0.05 km.Correlation values ρ k * ij ≈ 1 appear under mixed sky conditions only for very small d ij 0.05 km.With increasing distances, the three sky types' spatial autocorrelation structures also differ in their rates of decay.For example, the characteristic distance to reach ρ k * ij ≈ 0.5 is about 10 km for clear-sky and overcast conditions, while it is an order of magnitude smaller (∼ 1 km) for the mixed sky type.
The k * autocorrelation structures within the different sky types are consistent with the associated cloud patterns.The results for overcast and clear-sky conditions both suggest fairly large and homogeneous structures (cf. 10 km to reach ρ k * ij ≈ 0.5), corresponding to large stratus-type cloud layers in the former case, and homogeneously clear skies (with infrequent and localized shadowing of the sensors) in the latter.
During times classified as mixed, the structure of ρ k * ij indicates that heterogeneous cloud fields dominate, with much smaller length scales than those under overcast conditions.The decay length scale of correlations under mixed skies corresponds well with typical cloud length scales 2 km of cumulus-type clouds (Neggers et al., 2003).

Variability in clear-sky index increments
The previously discussed properties of the observed k * fields are independent of their ordering in time, i.e., randomly shuffling all sensor-pair data in time (within the 15 min windows) will result in the same spatial autocorrelation structures.While this overall variability is of some interest (e.g., when considering long-term yield of PV systems), it does not characterize how rapidly k * fields can change.A useful measure of intermittency in the clear-sky index is the statistics of for different time lags τ .

Increment statistics
PDFs of k * τ estimated from data from the Jülich campaign for three short-term time lags τ = (1, 10, 60) s are presented in Fig. 5.The results are first conditioned on the sky type classification (first three columns) and then shown again for all sky types together (rightmost column).As a first illustration of the effect of spatial averaging on fluctuation statistics, the PDF of area-averaged increments is also included in each panel.
All PDFs are characterized by a narrow central peak -corresponding to a high probability of very small incrementssurrounded by broad tails in which the PDF decreases slowly.With increasing τ , these tails become flatter and the probability of very large excursions increases.While the PDFs of single-sensor observations all exhibit such tails, these features are much less prominent for the spatially averaged k * increment distributions.This result is consistent with previous findings of e.g., Lave and Kleissl (2010), Marcos et al. (2011b), Dyreson et al. (2014), andvan Haaren et al. (2014).Comparing the distributions of increments from multiple pyranometers and PV plants of different capacities for various temporal resolutions, these previous studies all showed high magnitude changes with increasing time lag, and fewer high magnitude changes with increasing PV plant size (or numbers of sensors).
Under overcast conditions, the central peaks of the distributions are generally prominent and the tails are not particularly pronounced.The PDFs of clear skies also have a strong central peak but display broad flat tails (higher than for over-  3, based on data from both field campaigns.The summary statistics (cf.Table 1) are based on 10 logarithmically scaled bins of d ij , and include only those sensor pairs whose members simultaneously correspond to the same grid box for at least 60 min.The total number of pairs used to derive the statistics is given in each panel.Grid boxes to be subsequently grouped as similar sky types are indicated by colored boxes.cast conditions), representing the rare large excursions evident in Fig. 2. The central peak is wider under mixed conditions, and the flanks are flatter than for clear skies.Under mixed conditions, probabilities of large excursions are enhanced by the rapid changes associated with passing cloud edges.This distribution of increments is consistent with the individual time series shown in Fig. 2. The statistics obtained when all sky types are taken together feature the same shapes in the tails as those of the mixed sky type, because the extreme fluctuations in k * are most common under mixed conditions.
A measure of the extent of the tails of the PDF is the probability P( k * τ = ±0.5) of a single sensor to fluctuate by ±0.5.The values for this quantity, quoted in each panel of Fig. 5, take different orders of magnitude among the different sky types.These probabilities increase from overcast to clear and then to mixed sky conditions, while the val-ues associated with the overall statistics of all sky types are located somewhere in between the last two classifications.Compared to the statistics of all sky types, and independent of τ , P( k * τ = ±0.5) is more than twice as high under conditions classified as mixed.Thus, for applications such as the maintenance of grid stability, where worst case scenarios (in terms of strong short-term PV fluctuations) are of interest, the conditioning of k * τ statistics on different sky types demonstrates the strong dependence on specific sky conditions of the likelihood of severe fluctuations occurring shortly one after another.
While changes in increment variability are reflected by changes of σ k * τ (dashed lines in Fig. 5), the standard deviation is not an appropriate measure of the size of extreme fluctuations due to the non-Gaussian character of k * increment statistics.In consequence, the widely used three-sigma rule of thumb, according to which a range of k * τ ± 3σ k * τ were normally distributed (Nikulin, 2002), can be misleading when applied to k * fluctuations.For example, only about 95 % of the empirical single-sensor data are included in this range for the most variable subset of τ = 60 s and mixed skies (Fig. 5k).This result is in line with previous findings of the 99.7th percentile of 1 min increments in clear-sky index being about 7 standard deviations away from the mean (Mills, 2010).

Two-point correlations of increments
The differences between the single-sensor increment statistics and the distributions of areal averages in Fig. 5 are of interest, because the spatial averaging clearly results in a substantial reduction in the probability of high-magnitude k * fluctuations.In order to specify the underlying spatiotemporal field characteristics, we analyze two-point correlation coefficients of k * increments ρ k * τ ij (computed as in Eq. 5), based on data from both field campaigns.Various models have previously been proposed to predict the behavior of increment correlation as a function of distance for specific temporal scales, either by using empirical fits to measured data (Hoff and Perez, 2012;Lave and Kleissl, 2013;Lonij et al., 2013;Mills, 2010;Perpiñán et al., 2013) or by modeling simplified cloud shapes (Arias-Castro et al., 2014).While the empirical fits are based on data sets of limited spatiotemporal resolutions, the more theoretical model does not account for the complexity encountered in real cloud shapes.Our data set permits a direct empirical assessment of the spatial correlation structure of increments.
Conditioned on the previously defined classification scheme of sky types, summary statistics of ρ k * τ ij (cf.Table 1) are presented as functions of sensor pair distance for different time lags in Fig. 6, using 10 logarithmically scaled d ij bins.Also shown in Fig. 6 are the autocorrelation structures as functions of sensor pair distance divided by increment size, obtained using k * τ time series of all three increments τ = (1, 10, 60) s.The plots only include those sensor pairs whose members simultaneously correspond to the same sky type for a total of at least 60 min.
A useful measure of increment correlation structure is the decorrelation length scale d 0 , which we define to be the minimum distance at which ρ k * τ ij = 0.05.These distances generally increase with increasing time lags, and they decrease from overcast to clear skies, and then to mixed conditions.As an exception to the latter statement, under overcast and clear conditions k * increments associated with very short time lags τ = 1 s (panels a and b) are uncorrelated even for very small d ij .The mixed sky type features a measurable decorrelation τ as functions of sensor pair distance d ij for different time lags τ (a-i), as well as sensor pair distance divided by time lag (j-l).The results are conditioned on different sky types and based on data from both field campaigns.The summary statistics (cf.Table 1) are computed with 10 logarithmically scaled d ij bins, and include only those sensor pairs whose members simultaneously correspond to the same sky type for a total of at least 60 min.The colored regions correspond to the model (Hoff and Perez, 2012), using a range of relative cloud speeds 2 m s −1 < v < 10 m s −1 .Note the different scales on the vertical axes.length scale d 0 ≈ 90 m for τ = 1 s.For τ = 10 s (τ = 60 s), this distance increases to 2.2 km (7.2 km) under overcast, 0.2 km (1.5 km) under clear-sky, and 0.2 km (1.1 km) under mixed sky conditions.These decorrelation distances estimated from Fig. 6 are all summarized in Table 2.
In accord with previous studies published by Mills (2010) and Hoff and Perez (2012), increasing time lags are accompanied with increasing correlation coefficients for any given decreases with increasing d ij for any fixed τ , converging to zero.Both of these earlier studies analyzed lags on distances between tens and hundreds of kilometers.The Mills (2010) analysis uses 1 min averages of k * from 23 pyranometers, while Hoff and Perez (2012) derive their results from hourly satellite data.As anticipated by Perez et al. (2012), the presence of negative cor-relation extrema found during their single-sensor virtual network study are not observed in our analysis of observationally based ρ k * τ ij .The transitions of the spatial autocorrelation structures of k * τ between sky types and timescales shown in Fig. 6 in part resemble those presented by Perpiñán et al. (2013), who analyzed the relationship between correlation, distance, and timescale of PV power fluctuations, using data from 70 DC/AC inverters in a multi-megawatt PV plant.
In addition to the statistics of the measured data, each panel of Fig. 6 also includes colored regions corresponding to the model of spatial correlation coefficients proposed by Hoff and Perez (2012), using a range of cloud speeds 2 m s −1 < v < 10 m s −1 .These speed values correspond approximately to the range of mean vector winds at 850 hPa v 850 for Germany from the ERA-40 re-analysis atlas (2 m s −1 v 850 8 m s −1 ; Uppala et al., 2005;ECMWF, 2015).While Hoff and Perez (2012) obtained Eq. ( 7) as a curve fit from data with hourly resolution and pair distances up to several hundred kilometers, they also considered an initial assessment of its applicability to higher spatiotemporal resolutions.Compared to other models, Lave and Kleissl (2013) showed Eq. ( 7) to yield relatively high correlation coefficients and relatively long decorrelation length scales.Of the models considered by Lave and Kleissl (2013), the selected one provides the best fit to our data.As the model makes no distinction between sky types, its output in Fig. 6 is the same for each row of panels and the quantity v cannot be interpreted as the actual speed of the clouds.Rather, it represents a quantity with the units of speed that combines information about cloud type and motion.
The ranges of the model results are in broad agreement with the summary statistics of the two field campaigns and the general decrease of spatial correlation with increasing distance is reproduced well.However, differences between overcast and clear-sky conditions are evident, as the former tends to coincide with the upper region of the model range (corresponding to high v), while the latter rather agrees with the lower end of the range (low v).Overcast conditions for τ = 1 s again are an exception to this general rule.For the mixed sky type, and for all values of τ , the correlation values coincide with the upper region of the model for very small d ij , but they decrease more rapidly with increasing distances and thus feature decorrelation distances that are consistently much smaller than the modeled ones.
Finally, the bottom panels (Fig. 6j-l) bring the above results together by presenting the correlation coefficients of different sky types as functions of distance over time lag.Again, overcast conditions coincide with model outputs for high values of v almost entirely, while results for clear skies mostly overlap with modeled correlation coefficients for low v. Mixed conditions are highly correlated for short scaled distances (corresponding to the upper region of the model), Table 2. Increment decorrelation length scales d 0 and effective decorrelation length scales d 0 = d0 • τ (for which ρ k * τ ij = 0.05) estimated from Fig. 6.Scaled decorrelation distances d0 used to derive d 0 are obtained from the generalized analysis in said figure's panels j-l.

Overcast
Clear but de-correlate rapidly with increasing d ij • τ −1 .If the range of v values considered was broadened, the correlation structures would fall within the model envelope for all sky types.However, while the profiles for overcast and clear skies look similar to the model prediction for some specific v, this is not the case for the mixed cloud case.For short scaled distances the correlation decay corresponds to an intermediate value of v in the model -but after that, the correlations drop off much faster.This result demonstrates that the model is not able to capture the correlation structure for mixed sky conditions.

Variability in spatial averages
Averaging clear-sky index increments from different sensors provides an estimate of the output variability of an ensemble of PV installations at multiple locations.In order to assess the effect of area averaging on variability as a function of averaging area A, we employ the following random circle sampling method.First, the borders of the domain corresponding to each field campaign were determined by a convex hull encircling the instruments' coordinates, using the mean minimum distance between sensors as a circumferential padding.Within each of these domains, a number of circles of specified area were randomly placed.Area averages of k * and its increments in a circle were taken if at least 75 % of the circle area was in the domain and the circle included a number of pyranometers specified to maintain a constant sensor density.If there was an excess of pyranometers within a randomly chosen circle, a subset was randomly selected to maintain the specified density.This sampling method was adopted because of the irregular distribution of sensors in the two campaigns.Three representative circles are illustrated for the Jülich campaign in Fig. 7, using circle radii of 1.25 km and a fixed sensor density of 2 km −2 .The time series of k * and k * τ were then spatially averaged over the sensors within each circle for an ensemble of 500 different circles within each of 10 logarithmically spaced area bins, using a constant sensor density of 2 km −2 .Samples were drawn from both field campaigns, with the Melpitz campaign only allowing the use of circles up to about 4.5 km 2 , and the Jülich campaign covering the entire spatial range.Using the median standard deviations of k * and k * .Three representative realizations of randomly selected circles falling within the study domain, using circle radii of 1.25 km, and a fixed sensor density of 2 km −2 .The coordinates (UTM projection, grid zone 32U) of all pyranometers of the Jülich campaign are shown, along with their corresponding convex hull, using a circumferential padding amounting to the mean minimum distance between sensors (about 440 m).Valid circles used to compute areaaveraged quantities must overlap the domain by at least 75 % and include a specified number of pyranometers.If there is an excess of pyranometers within a randomly chosen circle, random subset selection will ensure the adherence to a constant sensor density.
over all sensors as normalizing factors, we define the normalized variability of area-averaged quantities as the relative standard deviations Atmos.Chem.Phys., 16, 6365-6379, 2016 www.atmos-chem-phys.net/16/6365/2016/and Conditioned on the previously described sky types and time lags, summary statistics (cf.Table 1) of these relative variabilities are presented as a function of averaging area in Fig. 8.As circles with areas below 0.5 km 2 contain only single sensors, no averaging occurs in these circles and the median relative variability for this area is 1.Although the standard deviation is not a good measure for extreme fluctuations (due to the non-Gaussian character of k * increment statistics), it offers a convenient way of characterizing typical excursions from the mean and has previously been used in similar contexts, for example by Hoff andPerez (2010, 2012).The standard deviation is also convenient to use because irrespective of the distribution of the data the relative variability will change as n −0.5 if the sensors are uncorrelated, where n is the number of sensors in the circle.The curve for the relative variability of uncorrelated sensors is included in Fig. 8. Variability in averaged clear-sky index decreases much more slowly with averaging area than does variability in increments.The decrease of variability with averaging area is also more rapid for shorter increment times than longer increments, and less rapid for overcast conditions than other sky types.For both τ = 1 s and τ = 10 s, the decrease of σ k * τ closely follows the n −0.5 curve corresponding to uncorrelated sensors.This is also generally true for clear and mixed conditions, but not for τ = 60 s under overcast conditions.This behavior is in agreement with the correlation structures presented in Figs. 4 and 6.As the standard deviation of the sum x + y of two random variables with correlation coefficient ρ xy is the standard deviation of the sum (or mean) of n positively correlated variables is always larger than that of n uncorrelated variables.For quantities with relatively long decorrelation distances, e.g., k * under overcast conditions (cf.panels A1 and B1 in Fig. 4), the effect of area averaging on fluctuations is reduced accordingly.The shorter the decorrelation distance becomes, the closer σ follows n −0.5 .Except for the aforementioned 60 s increments under overcast conditions, the decorrelation scales of increments are small compared to the radii of the circles used to compute the area averages.For example, averages for circles of diameter 1 km (A 0.8 km 2 ) and larger will be influenced when the decorrelation scale is about 1 km or smaller.Marcos et al. (2011b) present a similar figure of the 90th percentile of power fluctuations as a function of PV plant area, based on data from eight different power systems.They find that the smoothing effect follows S −0.5 for small increments of 1 and 2 s, with S denoting the PV plant extension,  1) are computed for 10 discrete areas that are logarithmically increasing in size, and contrasted with the uncorrelated decrease of variability following n −0.5 .The random circle method is used to sample these areas, using 500 different circles for each area bin, and a constant sensor density of 2 km −2 .Samples are drawn from both field campaigns, with the Melpitz campaign only allowing for circles up to about 4.5 km 2 of area, and the Jülich campaign covering the entire spatial range.
while variability decreases less rapidly with increasing increment time lags for τ > 5 s.As the areas of PV power plants are typically densely covered by PV modules, the effect of longer decorrelation distances (e.g., due to increased τ ) and higher spatial correlation coefficients are much stronger in a centralized PV plant than in the distributed scenario represented by our study.Hoff and Perez (2010) have introduced the concept of the dispersion factor to describe these differences in the spatial smoothing effect as a function of the along-wind extension of a set of PV systems, the cloud transit rate and the increment time lag considered.
When considering the implications of clear-sky index variability for PV power, the short-term sky type classifications used throughout this study can be linked to distinct PV power fluctuation levels as in Perpiñán et al. (2013): overcast, clear, and mixed conditions correspond, respectively, to low, intermediate, and high PV fluctuation levels.While overcast conditions may necessitate other power sources to substitute the momentary lack of PV generation, clear and mixed conditions can negatively impact the electrical grid.Under clear conditions, high PV power feedback (i.e., reverse power flow from the distribution grid to the transmission grid) can occur and needs to be managed in areas of high PV penetration (Wirth et al., 2014), whereas mixed conditions may endanger the grid's reliability due to the frequent power ramping of PV systems (van Haaren et al., 2014).In the context of subminute variability, we illustrate the event risk of k * fluctuations by means of increment PDFs in Fig. 5, which quantifies how mixed conditions feature flatter tails and higher probabilities of strong fluctuations than overcast and clear skies.As the variability of the effective irradiance incident on an inclined plane has been reported to be higher on a daily basis than that on the horizontal plane (Suri et al., 2007;Perpiñan, 2009), these strong k * fluctuations may be even higher when considering tilted PV panels.
The relationship between measured irradiance and PV power also includes a smoothing effect, because the area of a pyranometer is very small compared to that covered by a PV system's panels.Thus, the larger the spatial footprint of a considered set of PV systems, the more pronounced the smoothing effect will be, and the lower the necessary temporal resolution of the data may become.When considering many interconnected PV systems in a very large area, e.g., all of Europe, spatial smoothing appreciably reduces the necessary temporal resolution of data (the European Energy Exchange, for example, uses 15 min time steps for electricity trading).At the other end of scale, Marcos et al. (2011b) find power fluctuations of, e.g., up to ±50 % from one second to the next (and changes of more than 90 % for a time lag of 20 s) in 1 Hz power data from a single, relatively small, 48 kWp PV plant.Yordanov et al. (2013b) even argue that the optimal temporal resolution of single-point irradiance measurements should be around 10 Hz, in order not to miss extremely short but relatively high magnitude changes.Thus, a high temporal resolution may not be necessary for large-scale analyses, but it is key to characterize local short-term variability in solar irradiance on the spatiotemporal scales that we investigated.
There are substantial differences between the variability characteristics of single utility-scale PV plants covering relatively small areas, and fleets of distributed systems with similar total capacities, but spanning relatively large areas (Hoff and Perez, 2010).The spatial scales covered by our data sets are representative of both distributed generation and utility-scale power systems, and the spatial autocorrelation structures of k * and k * fields presented in Figs. 4 and 6, as well as the area averaged results of Fig. 8 quantify the smoothing effects.For a scenario of low-penetration distributed generation (two systems per km 2 ), the area-averaged variability in k * accordingly remained under the influence of positively correlated sensors (especially under overcast conditions).In contrast, variability in area-averaged sub-minute k * τ was shown to vary as 1 over the square root of the number of sensors as this quantity is only weakly correlated for the sampled area sizes in Fig. 8. Hoff and Perez (2012) have used hourly satellite-derived irradiance data from three different locations in the United States to analyze two-point correlations of k * increments as a function of pair distance, with samples being separated between 10 and 300 km.Taking relative cloud speed and increment time lag (1 h ≤ τ ≤ 4 h) into account, they proposed a model for ρ k * τ ij (cf.Eq. 7) that implies a linear relationship between distance and time lag for fixed values of the correlation coefficient.Hoff and Perez (2012) provide evidence for this relationship by presenting a linear scaling of station pair distances and time lags for fixed correlation coefficients, based on satellite-derived data (their Fig. 7).Perez et al. (2012) show similar results for decorrelation distances below 10 km and time lags below 15 min, based on virtual pyranometer networks (i.e., single-sensor measurements shifted in time) with temporal resolutions of the single-point measurements being as low as 20 s (their Fig. 5).Our findings indicate that this linear scaling of distance and time lag may not necessarily hold for observed multi-point samples of k * fields at very high spatiotemporal resolutions.Along with the decorrelation distances d 0 , Table 2 also quotes effective decorrelation length scales d 0 = d0 • τ for each time lag.These are derived using scaled decorrelation distances d0 obtained from the generalized analysis of correlation coefficients as functions of distance over time lag (Fig. 6jl).While the values of d 0 agree reasonably well with d 0 for τ = 10 s and τ = 60 s, the results differ substantially for τ = 1 s.When analyzing correlation coefficients as functions of distance over time lag, as in Fig. 6j-l, the curves associated with the different sub-minute τ do not collapse on top of each other when plotted separately (not shown).Instead, high-τ data yield systematically lower correlation coefficients than low-τ data, and scaled decorrelation distances d0 are sorted in descending order along the 1, 10, and 60 s time lags.This sequencing has also been observed, but not discussed, by Hinkelman (2013) (the author's Fig. 9) for 10 s < τ < 600 s, which again suggests that the linear relationship between distance and time lag clearly shown by Hoff and Perez (2012) to govern large scales may not be applicable on very small scales.

Conclusions
With the continual global increase of PV power systems and the inherent weather-induced volatility of their power output, characterizing the underlying irradiance variability in both space and time is important for the planning and reliable operation of future power grids.In the present study, we analyzed spatiotemporal field characteristics of clear-sky index and sub-minute k * increment variability during HOPE for distances between tens of meters and about 10 km.The use of up to 99 synchronized silicon photodiode pyranometers operating at a temporal resolution of 1 Hz allowed characterization of variability in the lower ranges of the relevant space and timescales with unprecedentedly fine resolution (although the experimental time series span mid-spring to mid-autumn and may not be representative of other times of the year).
By means of a simple classification scheme based on clearsky index statistics, we identified overcast, clear, and mixed sky conditions, and subsequently analyzed sub-minute k * increments conditioned on these sky types.Mixed sky conditions, characterized by relatively high spread and intermediate averages of clear-sky index, were shown to feature subminute increment PDFs with flatter tails and higher probabilities of strong fluctuations relative to overcast and clear skies.Of the three cloud types, mixed conditions are the most potentially problematic in terms of short-term PV power fluctuations.Compared to increment statistics computed without conditioning by sky type, the probability of relatively strong k * increments of was approximately twice as high under mixed conditions.
The corresponding spatial autocorrelation structures of k * τ revealed very low correlation coefficients for τ = 1 s, even for short distances.With the smallest intersensor separation bin ranging from 28 to 51 m, and typical cloud speeds between 2 and 10 m s −1 , cloud-induced shadows are not fast enough to cover even the shortest of the analyzed sensor pair distances within a second.For a more robust characterization of the decrease of ρ k * τ ij from 1 to 0 for τ = 1 s, the pyranometer network would thus have to be reconfigured to feature much shorter intersensor distances.
As a proxy for the smoothing effects of distributed PV, spatial averaging was shown to effectively mitigate relative variability in k * increments with increasing areas.While, for example, averaging areas required to reduce σ k * τ in half were nearly the same for τ = 1 s (1.9, 1.7, and 1.5 km 2 , respectively, for overcast, clear, and mixed skies), they differed more between overcast conditions on the one hand, and clear or mixed skies on the other hand, with increasing τ (for τ = 10 s: 3.6, 1.8, and 1.6 km 2 ; for τ = 60 s: 9.1, 2.9, and 2.8 km 2 , respectively, for overcast, clear, and mixed skies).Spatial averaging on the scales under consideration was less effective at mitigating variability in k * .
These initial characterizations of PV-related clear-sky index variability during HOPE can be extended to consider other issues of relevance to solar PV power generation.For example, a more refined analysis of the two-point spatial autocorrelation structures of k * increments could be carried out using actual cloud speed information, and the linear scaling of distance and time lag discussed in Sect.6 could be investigated in detail.Combining these data with coarser-resolution ones, for example satellite-derived, could thereby appreciably extend the spatial domain of the analysis.An evaluation of short-term variability reduction due to temporal averaging could also be a direction of future study, taking advantage of the unusually high temporal resolution of the original data acquisition unit (10 Hz) to critically assess the common practice of using longer temporal averages when measuring irradiance (Ohmura et al., 1998).Moreover, the simple increment procedure used to compute the fluctuations of k * in Eq. ( 6) could be improved by using a wavelet-based approach, as discussed by e.g., Gallego et al. (2013).

Figure 1 .
Figure 1.Panels (a, b) show maps of the coordinates (UTM projection, grid zone 32U) of all pyranometers deployed during the two field campaigns in (a) Melpitz and (b) Jülich.Panel (c) displays a histogram of the combined station pair distances from both data sets.

Figure 2 .
Figure 2. Examples of different spatiotemporal variability in clearsky index k * for three distinct cases of sky types: (a) mostly overcast, (b) mostly clear, and (c) mixed.These representative subsets have been manually selected from the Jülich campaign and span 15 min each.The time series of randomly selected sensors (red curves) are contrasted with summary statistics of field variability, represented as box plots (Table1).

Figure 3 .
Figure 3. Kernel density estimate (KDE) of the joint probability density function (PDF) of mean k * i and standard deviation σ k * i of clear-sky index, based on all available single-sensor data from both measurement campaigns.The overlaid regular grid is used to define particular sky type classes.

Figure 4 .
Figure 4. Spatial two-point correlation coefficients ρ k * ij of clear-sky index k * as functions of sensor pair distance d ij for each of the grid boxes from Fig. 3, based on data from both field campaigns.The summary statistics (cf.Table1) are based on 10 logarithmically scaled bins of d ij , and include only those sensor pairs whose members simultaneously correspond to the same grid box for at least 60 min.The total number of pairs used to derive the statistics is given in each panel.Grid boxes to be subsequently grouped as similar sky types are indicated by colored boxes.

Figure 5 .
Figure 5. Statistics of clear-sky index increments k *τ for different time lags τ , based on data from the Jülich campaign.The first three columns display distributions conditioned on different sky types, while the rightmost column presents the combined statistics for all sky types.The estimated probability density functions of all single sensors (solid black lines) are supplemented with the range of ±1 standard deviation σ k * τ around their means k * τ (dashed black lines), and contrasted to the much narrower distributions of the spatially smoothed average of all pyranometers (solid gray lines).The respective average probabilities P ( k * τ = ±0.5) of single sensors to fluctuate by ±0.5 are quoted in each panel.

Figure 6 .
Figure 6.Spatial two-point correlation coefficients ρ k * τ of clearsky index increments k *τ as functions of sensor pair distance d ij for different time lags τ (a-i), as well as sensor pair distance divided by time lag (j-l).The results are conditioned on different sky types and based on data from both field campaigns.The summary statistics (cf.Table1) are computed with 10 logarithmically scaled d ij bins, and include only those sensor pairs whose members simultaneously correspond to the same sky type for a total of at least 60 min.The colored regions correspond to the model Figure7.Three representative realizations of randomly selected circles falling within the study domain, using circle radii of 1.25 km, and a fixed sensor density of 2 km −2 .The coordinates (UTM projection, grid zone 32U) of all pyranometers of the Jülich campaign are shown, along with their corresponding convex hull, using a circumferential padding amounting to the mean minimum distance between sensors (about 440 m).Valid circles used to compute areaaveraged quantities must overlap the domain by at least 75 % and include a specified number of pyranometers.If there is an excess of pyranometers within a randomly chosen circle, random subset selection will ensure the adherence to a constant sensor density.

Figure 8 .
Figure 8. Normalized standard deviations σ k * σ *τ in clearsky index k * (a-c) and its increments k * τ (d-l) as functions of averaging area A for different short-term time lags τ , and conditioned on different sky types.The summary statistics (cf.Table1) are computed for 10 discrete areas that are logarithmically increasing in size, and contrasted with the uncorrelated decrease of variability following n −0.5 .The random circle method is used to sample these areas, using 500 different circles for each area bin, and a constant sensor density of 2 km −2 .Samples are drawn from both field campaigns, with the Melpitz campaign only allowing for circles up to about 4.5 km 2 of area, and the Jülich campaign covering the entire spatial range.