Interactive comment on “CLARA-A2: The second edition of the CM SAF cloud and radiation data record from 34 years of global AVHRR data”

The manuscript "CLARA-A2: The second edition of the CM SAF cloud and radiation data record from 34 years of global AVHRR data" by K.-G. Karlsson et al. fits the scope of the journal and deserves to be considered for publication after some changes are made. The paper is reasonably well written and understandable. Results and figures are provided with sufficient quality.

Major comments

The manuscript contains several statements on the improvements of CLARA-A2 with respect to the previous version CLARA-A1. While this is fine and informative for po-


Introduction
Global distribution of cloudiness and existing cloud feedback on the radiative forcing continue to be important topics in climate research. Uncertainties in the description and understanding of both topics are considered to be dominant in explaining the spread among climate models in their prediction of current and anticipated climate change (Webb et al., 2013; Vial et al., 2013). In parallel, better knowledge and monitoring of global cloudiness and radiation are also required for a successful increased utilization of renewable energy sources, such as solar power plants (Šúri et al., 2007). In order to address requests and challenges in these and adjacent fields by a systematic utilization of satellite measurements, the Climate Monitoring Satellite Application Facility (CM SAF) was formed by the European Organisation for the Exploitation of Meteorological Satellites, EUMETSAT (Schulz et al., 2009).
CM SAF (www.cmsaf.eu) aims to develop capabilities for a sustained generation and provision of climate data records (CDRs) derived from operational meteorological satellites. The ultimate aim is to make the resulting data records suitable for the analysis of climate variability and the detection of climate trends. Examples of important guidelines for the compilation of CDRs are (1) to apply the highest standards and guidelines as outlined by the Global Climate Observing System (GCOS), (2) to process satellite data within a true international collaboration benefiting from developments at international level, and (3) to perform intensive validation and improvement of the CM SAF CDRs, including a major role in data record assessments performed by research organizations such as the World Climate Research Programme (WCRP).

Figure caption: The figure shows ascending (northbound) equator crossing times for all afternoon satellites from NOAA-7 to NOAA-19 and descending (southbound) equator crossing times for all morning satellites (NOAA-12, NOAA-15, NOAA-17 and METOP A+B). Corresponding nighttime or evening observations take place 12 h earlier or later. Some data gaps are present but only for a number of isolated dates.
One of CM SAF's CDRs is CLARA: "The CM SAF Cloud, Albedo And Surface Radiation dataset from AVHRR data". It is based on data from the Advanced Very High Resolution Radiometer (AVHRR) operated on-board polar orbiting NOAA satellites as well as by the MetOp polar orbiters operated by EUMETSAT since 2006. AVHRR offers one of the longest satellite observation records, with its first measurements commencing in 1978. The first edition of CLARA (CLARA-A1) was released in 2012 and is described by Karlsson et al. (2013). This paper describes improvements and other features of the second edition, CLARA-A2, which was released in 2017.
The basic AVHRR radiance measurements were previously described in detail by Karlsson et al. (2013). Consequently, Sect. 2 describes only the extension of the data series since CLARA-A1 and some further modifications to improve calibration and homogenization of the entire data record. Section 3 includes general descriptions on how the data record was compiled and Sects. 4 through 6 explain the most significant improvements made to retrieval methods for the three different groups of parameters (clouds, surface albedo and radiation) together with some validation results. For the latter, some focus has been on extensive intercomparisons being made to space-borne active lidar cloud retrievals (CALIPSO-CALIOP) and to other existing satellite-based data records (e.g. PATMOS-x and MODIS). Finally, Sect. 7 summarizes the main features of the data record and presents future plans.

Extension and homogenization of the historic AVHRR data record
The basic AVHRR radiance measurements (level-1 observations) used in CLARA-A2 are described in detail by Karlsson et al. (2013). However, the temporal coverage is now extended with 6 additional years (2010-2015), resulting in a total length of 34 years. Figure 1 illustrates all satellites and their respective measurement periods for the CLARA-A2 climate data record. It is clear that the observational coverage varies considerably; there is only one satellite in orbit providing measurements during the 1980s and early 1990s, while during the last decade at least four simultaneous satellites were present (with a peak of six satellites available simultaneously in 2009). Further, orbital drift for individual satellites leads to changing local observation times, and this contributes to varying observational conditions during the period. However, some sub-setting of the data could still yield relatively homogeneous observation conditions. For example, through exclusively choosing afternoon satellites (which is possible with the CLARA-A2 data record), a quite homogeneous and stable time series of observations can be achieved.
The AVHRR instrument was initially built for operational global weather monitoring purposes, not for climate monitoring. This means that the radiometric accuracy and the stability of radiance measurements are sometimes problematic for some early satellites in the time series. In addition, NOAA archiving of data has its own problems with intermittent occurrences of gaps, duplications and corrupt data, depending on time period and satellite. Consequently, a substantial effort in the preparation of CLARA-A2 has been made to correct and homogenize the entire radiance (level 1) record. A special pre-processing tool (PyGAC) was developed for these purposes, described in detail by Devasthale et al. (2016b). Some of the most important aspects have been the following:
- Removal of corrupt data
- Data rescue of data with incorrect header definitions
- Removal of duplicated orbits
- Removal of overlap between orbits
- Homogenization of visible calibration by removal of trends and performing inter-calibration techniques between satellites (based on the method by Heidinger et al., 2010, but extended with more satellites and with MODIS Collection 6 as reference data)
- Improving accuracy of infrared calibration (compared to CLARA-A1) by using a more accurate treatment of calibration target data
- Applying median filters to AVHRR channel 3b (at 3.7 µm) brightness temperatures for reducing the impact of high noise levels for satellites NOAA-7 to NOAA-14
- Removal of partially corrupt orbits in periods with AVHRR scan motor problems (primarily between the years 2001 and 2005; this was mostly based on manual inspection efforts since operational data flagging does not sufficiently cover this problem).
The overall impact of these treatments resulted in the exclusion of approximately 6 % of all original level-1 data in the NOAA archive from processing. The work with improving the AVHRR level-1 data record (or the fundamental climate data record, FCDR) has been performed within the framework of the WMO project SCOPE-CM (http://www.scope-cm.org/) and the ESA Cloud_cci project (http://www.esa-cloud-cci.org).
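The median filtering of noisy channel 3b brightness temperatures listed above can be illustrated with a minimal sketch. This is not the PyGAC implementation: the function name `median_filter`, the one-dimensional layout and the window size of 3 are assumptions made purely for illustration.

```python
from statistics import median


def median_filter(values, window=3):
    """Sliding-window median over a 1-D sequence of brightness temperatures.

    The window size is a hypothetical choice; near the edges the window
    is simply truncated to the samples that exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        filtered.append(median(values[lo:hi]))
    return filtered


# A single noisy spike (320 K) in an otherwise smooth scan line is suppressed.
scan_line = [280.0, 281.0, 320.0, 282.0, 283.0]
smoothed = median_filter(scan_line)
```

A median is used rather than a mean so that an isolated outlier does not leak into neighbouring samples.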
3 Product overview highlighting changes in product aggregation since CLARA-A1

The CLARA-A2 CDR is based on instantaneous AVHRR global area coverage (GAC) retrievals (i.e. for every orbit at approximately 4 km horizontal-swath resolution in nadir) which have been aggregated to derive the final spatiotemporally averaged data records. Since CLARA-A1, an important change for the cloud products is the introduction of globally resampled daily composites (level-2b) as the basis for computation of final level-3 products. The level-2b approach leads to a significant reduction of the amount of used observations. However, the high observation frequency near the poles is undoubtedly very valuable, and, consequently, there are also separate polar products added which are based on all available observations. The level-2b approach is used exclusively for cloud products and not for surface radiation and surface albedo products where the use of all existing data is more critical.
Final level-3 cloud products are available as daily and monthly composites, where the monthly means are computed from daily means. Results are defined for each satellite on a regular latitude-longitude grid with a spatial resolution of 0.25° × 0.25°. In addition, results for cloud amount as well as the surface albedo (see Sect. 5) are available on two equal-area polar grids at 25 km resolution for the Arctic and Antarctic regions; these grids are centred at the poles and cover areas of approximately 9000 km × 9000 km. The new features for CLARA-A2 include the availability of all daily level-2b products and a demonstration data record of probabilistic cloud masks (further explained in the next section).
Monthly averages of cloud products are also available in aggregated form (i.e. merging all satellites). Acknowledging the different observation capabilities during the night and during the day, and also taking existing diurnal variations in cloudiness into consideration, a further separation of some products into exclusive day and night portions has been performed as a complement to the standard products that are based on all data. For these complementary products, all observations made under twilight conditions (solar zenith angles (SZAs) between 75° and 95°) have been excluded in order to avoid being affected by specific cloud detection problems occurring in the twilight zone (e.g. Derrien and LeGleau, 2010).
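The day/night separation described above can be sketched as a simple classification on solar zenith angle. The text only states that the twilight range of 75° to 95° is excluded; treating SZA below 75° as day, SZA above 95° as night, and the boundary values as twilight are assumptions of this sketch, as are the function and key names.

```python
TWILIGHT_MIN_SZA = 75.0  # degrees, from the text
TWILIGHT_MAX_SZA = 95.0  # degrees, from the text


def illumination_class(sza_deg):
    """Classify an observation as 'day', 'twilight' or 'night' by SZA."""
    if sza_deg < TWILIGHT_MIN_SZA:
        return "day"
    if sza_deg <= TWILIGHT_MAX_SZA:
        return "twilight"
    return "night"


def split_day_night(observations):
    """Return (day, night) lists; twilight observations are dropped.

    Each observation is a dict with a hypothetical 'sza' key in degrees.
    """
    day = [o for o in observations if illumination_class(o["sza"]) == "day"]
    night = [o for o in observations if illumination_class(o["sza"]) == "night"]
    return day, night


obs = [{"sza": 30.0}, {"sza": 80.0}, {"sza": 120.0}]
day_obs, night_obs = split_day_night(obs)
```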
All products described in the following three sections are described in detail in product user manuals (PUMs), algorithm theoretical basis documents (ATBDs) and validation reports (VALs), all available via the CM SAF web user interface (accessible from www.cmsaf.eu). These documents are important as they describe and reference the latest algorithms utilized in the processing of the CLARA-A2 data record; the peer-reviewed publications of retrieval algorithms referred to in the following Sects. 4-6 may not always be up to date with these very latest algorithm changes.

Cloud products
A list of all CLARA-A2 aggregated cloud products is given in Table 1. These products have been derived from the pixel-level retrievals of the respective cloud properties, which are also made available in the form of level-2b products, as outlined in Sect. 3. Basic methods for deriving these parameters can be found in Karlsson et al. (2013). Consequently, the following sub-sections only provide a brief introduction to the products, list the most significant improvements since CLARA-A1 and introduce some new features.

Improvements to basic cloud products derived from the NWCSAF cloud processing package
The cloud fractional cover (CFC) product is derived directly from results of a cloud screening, or cloud masking, method.
CFC for one particular instantaneous observation is defined as the fraction of cloudy pixels per grid box compared to the total number of analysed pixels within that grid box, expressed as a percentage. This product is calculated using the NWC SAF polar platform system (PPS) cloud processing software (Dybbroe et al., 2005). CFC is also prepared in complementary day-time and night-time conditions. The PPS method also computes the cloud top level (CTO) product, providing the cloud top level as geometric height, cloud top pressure and cloud top temperature. The CTO retrieval uses two different radiance-matching methods, one for clouds identified as opaque and one for semi-transparent clouds.
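The CFC definition given above translates directly into a short calculation. The sketch below is illustrative only and is not taken from PPS; the encoding of the binary mask (1 for cloudy, 0 for cloud-free, `None` for pixels that were not analysed) is an assumption.

```python
def cloud_fractional_cover(mask):
    """Return CFC in percent for one grid box.

    mask: iterable with 1 (cloudy), 0 (cloud-free) or None (not analysed).
    Returns None when the grid box contains no analysed pixels.
    """
    analysed = [m for m in mask if m is not None]
    if not analysed:
        return None
    return 100.0 * sum(analysed) / len(analysed)


# Two cloudy and two cloud-free pixels; the unanalysed pixel is ignored.
cfc = cloud_fractional_cover([1, 1, 0, 0, None])
```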
CLARA-A2 takes advantage of some significant upgrades of the cloud masking and CTO retrievals in the latest PPS version. Generally, the utilization of reference measurements from the CALIPSO-CALIOP sensor (Winker et al., 2009; Vaughan et al., 2009) has been fundamental for the development and validation of the methods, following approaches by Karlsson and Dybbroe (2010) and Karlsson and Johansson (2013). The most important improvements regarding cloud screening include the following:
1. PPS dynamic cloud masking thresholds have been adjusted, guided by cloud optical thickness information provided by CALIPSO-CALIOP, to detect a larger fraction of the thinnest clouds. Thus, thresholds for AVHRR visible reflectances and infrared brightness temperature differences (the latter often sensitive to presence of semi-transparent clouds) have been optimized.
3. New dynamic thresholds for infrared-brightness temperature-difference features have been introduced, in particular for the differences relative to the 3.7 µm channel over arid and semi-arid regions.
For CLARA-A1 some static thresholds were used previously which led to occasional false cloudiness and unrealistic cloud distributions and trends over semi-arid regions, as reported by Sun et al. (2015) and Sanchez-Lorenzo et al. (2017). The new thresholds are functions of surface emissivity (from MODIS climatologies) and viewing angles. The impact of these changes is illustrated in Fig. 2, which shows changes between CLARA-A1 and CLARA-A2 over the African continent. Clear reductions are shown over semi-arid regions, while for other regions changes are close to neutral or slightly positive.
The challenging cloud screening conditions near the poles have received special attention. Cloud detection during polar day conditions over snow- and ice-covered surfaces has been optimized, and falsely detected clouds during polar night conditions have been largely removed. In both cases the access to CALIOP cloud masks and CALIOP-estimated cloud optical thicknesses has been extremely valuable. Various validation scores have been studied and PPS thresholds were adjusted to optimize the scores. The removal of falsely detected clouds unfortunately leads to a systematic and enhanced (compared to CLARA-A1) underestimation of cloudiness over the Arctic and Antarctic regions during polar night. However, we are of the opinion that this better reflects the true cloud detection limitations of the AVHRR sensor in situations with very cold ground temperatures, compared to the previous case with frequently occurring false cloudiness.
Figure 3 compares results from CLARA-A1 and CLARA-A2 using global, synoptic surface observations (SYNOP) of cloud cover. For this study, the CLARA-A2 monthly mean product, generated from all available satellites, was compared against SYNOP monthly-mean cloud cover calculated based on daily means. Only those stations and months with at least 6 observations per day for 20 days of the respective month were included in the comparison (see the VAL report for more details). Results show relatively small changes in CFC bias but a substantial decrease in the bias-corrected root mean squared error (bc-RMSE) for CLARA-A2. Thus, a much better agreement with the SYNOP-observed variability in cloud cover is achieved. The relatively unchanged bias reflects inherent and unavoidable differences in the viewing geometry for the two observation types.
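To make the distinction between bias and bc-RMSE concrete, the following minimal sketch computes both from paired monthly means. It assumes the common definition of bc-RMSE as the root mean squared difference after subtracting the mean bias; the exact formulation used in the VAL report may differ, and the function name is hypothetical.

```python
import math


def bias_and_bc_rmse(satellite, reference):
    """Return (bias, bc_rmse) for paired values, e.g. monthly CFC in percent.

    bias    = mean(satellite - reference)
    bc_rmse = sqrt(mean((satellite - reference - bias)^2))
    """
    if len(satellite) != len(reference) or not satellite:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - r for s, r in zip(satellite, reference)]
    bias = sum(diffs) / len(diffs)
    bc_rmse = math.sqrt(sum((d - bias) ** 2 for d in diffs) / len(diffs))
    return bias, bc_rmse


# A constant offset gives a non-zero bias but a bc-RMSE of zero, which is
# why bc-RMSE isolates agreement in variability from systematic differences.
bias, bc_rmse = bias_and_bc_rmse([60.0, 70.0, 80.0], [55.0, 65.0, 75.0])
```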
The improvements in cloud detection are also reflected in comparisons with cloud observations from the CALIPSO-CALIOP instrument. Karlsson and Johansson (2013) compared CLARA-A1 results with CALIPSO-CALIOP data for 99 selected NOAA-18 orbits and we have repeated this study for CLARA-A2 with overall global results provided in Table 2; the validation scores (bias, Kuipers and hit rate) are explained in Karlsson and Johansson (2013). The number of matched fields of view (FOVs) differs slightly here despite using the same 99 matched orbits, which is explained by some pixels being masked out for quality reasons (in the polar areas) in the CLARA-A1 data record. We notice that more clouds are detected (bias being reduced) in CLARA-A2 despite the fact that some previous false classifications over semi-arid regions are now removed. Thus, both more clouds are detected and the cloudy-cloud-free separation has improved, indicated by improved Kuipers and hit rate scores.
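The three validation scores can be sketched from a 2 × 2 contingency table of AVHRR versus CALIOP cloud masks. The formulas below are the standard textbook definitions (hit rate as the fraction of correct classifications, Kuipers score as probability of detection minus probability of false detection, bias as the difference in mean cloud amount); they are assumed, not confirmed, to match those in Karlsson and Johansson (2013), and all names are hypothetical.

```python
def validation_scores(hits, false_alarms, misses, correct_negatives):
    """Scores from a 2x2 contingency table (satellite vs. reference).

    hits:              both cloudy
    false_alarms:      satellite cloudy, reference cloud-free
    misses:            satellite cloud-free, reference cloudy
    correct_negatives: both cloud-free
    Assumes the reference contains both cloudy and cloud-free cases.
    """
    n = hits + false_alarms + misses + correct_negatives
    hit_rate = (hits + correct_negatives) / n
    pod = hits / (hits + misses)
    pofd = false_alarms / (false_alarms + correct_negatives)
    kuipers = pod - pofd
    # Mean cloud amount difference (satellite minus reference), in percent.
    bias_percent = 100.0 * ((hits + false_alarms) - (hits + misses)) / n
    return {"hit_rate": hit_rate, "kuipers": kuipers, "bias_percent": bias_percent}


scores = validation_scores(hits=40, false_alarms=10, misses=20, correct_negatives=30)
```

A negative bias, as in this made-up example, means the satellite record reports fewer clouds than the reference.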
Figures 4 and 5 demonstrate the achievements made in cloud detection efficiency in CLARA-A2 in much more detail. Results are based on an extensive cloud product monitoring effort utilizing near-simultaneous (i.e. within 3 min) observations from the CALIPSO-CALIOP sensor over nearly 10 years (2006-2015). Despite the nadir-only observation capability of the CALIOP sensor compared to the wide-swath coverage from AVHRR, it has been possible to collect a global picture of cloud detection efficiency by accumulating results over the relatively long time period. Figure 4 shows the overall global frequency of correct cloudy and cloud-free estimations (hit rate). In this figure we have only considered CALIOP-detected clouds with vertically integrated optical depths exceeding 0.15 (thinner clouds being treated as cloud-free cases). This is done to avoid being overly influenced by the presence of sub-visible clouds (explaining a large part of the negative bias values in Table 2) that are beyond detection capability in passive imagery. Results show a general global agreement in cloud screening well above 80 %, apart from over the poles and high-latitude land areas and over high mountainous terrain. Decreased hit rates are also found over the marine sub-tropical regions near the climatological centres of sub-tropical highs. We suspect that this is mainly attributed to increasing geolocation mismatches between AVHRR- and CALIOP-observed clouds being present on the AVHRR GAC sub-pixel scale (less than 4 km). In other words, true small-scale or fractional clouds may exist in any of the two inter-compared data records but not always simultaneously in both because of the small sizes of cloud elements. These mismatches increase when facing conditions with a larger proportion of sub-AVHRR pixel-scale cloudiness, and such conditions are likely to occur in the central regions of the marine sub-tropical highs. Here we encounter more scattered cloudiness, occurring either as individual cumulus clouds or as broken stratocumulus or cumulus clouds in open cell formation (as described by Stevens et al., 2005), while away from these regions clouds organize more frequently as closed cells or as extensive stratocumulus cloud decks. Cases with a more dominant appearance of small cumulus and/or fractional stratocumulus can also occur over land surfaces but the more heterogeneous conditions over land are likely to create a more diversified distribution of cumulus clouds in different stages of development and size. Nevertheless, the lower hit rates also observed over the eastern part of South America and in eastern Africa may also be explained by a high frequency of sub-pixel-scale cloudiness.

Figure caption fragment: Results were derived from the same dataset as in Fig. 3.
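The two matching criteria mentioned above (a time difference within 3 min and a CALIOP optical depth above 0.15) can be expressed as a small helper that produces the reference cloud flag. This is a sketch under assumed conventions: times in seconds, a strict "greater than" comparison for the optical depth, and hypothetical function and parameter names.

```python
MAX_TIME_DIFF_S = 180.0   # 3 min, from the text
MIN_OPTICAL_DEPTH = 0.15  # from the text


def caliop_reference_flag(avhrr_time_s, caliop_time_s, caliop_optical_depth):
    """Return True (cloudy), False (cloud-free) or None (no valid match).

    Clouds with an optical depth at or below the threshold are treated
    as cloud-free, mirroring the handling of sub-visible clouds.
    """
    if abs(avhrr_time_s - caliop_time_s) > MAX_TIME_DIFF_S:
        return None
    return caliop_optical_depth > MIN_OPTICAL_DEPTH


flag_thin = caliop_reference_flag(0.0, 100.0, 0.10)
flag_thick = caliop_reference_flag(0.0, 100.0, 0.50)
flag_late = caliop_reference_flag(0.0, 200.0, 0.50)
```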
The poorer results seen over regions where cold surface conditions may prevail for considerable portions of the year (Fig. 4) are a potentially more serious issue. Figure 5 exemplifies this by showing the probability of detecting cloudy conditions over the Arctic. Over the coldest portions of Greenland and the inner Arctic, almost 50 % of the clouds remain undetected in CLARA-A2 during the polar winter. On the other hand, cloud screening complications are reduced during the polar summer when results are nearly as good as over any other region on Earth (excluding some highly elevated areas of Greenland).
Comparisons have also been made to the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor Collection 6 data record (http://modis-atmos.gsfc.nasa.gov/products_C006update.html) from the Aqua satellite (Fig. 6). Generally, we find a very good agreement between the two data records, both in the geographical distribution and in the zonal averages, of global cloud conditions for the overlapping data records of 2002-2014. There is a bias of about 5 % in cloud cover (MODIS is higher) that is relatively constant over all latitudes (Fig. 6, lower panels). This increased cloudiness in MODIS is interpreted as representing the improvements in spectral channel availability of the MODIS sensor in comparison to AVHRR. However, the very good correlation with MODIS results is encouraging considering the availability of two more decades of results from AVHRR.
CLARA-A2 also includes a demonstration data record of probabilistic cloud masking following Karlsson et al. (2015), defined in the level-2b data record. The alternative formulation here provides a measure of uncertainty in cloud masking for the user to consult, compared to the traditional binary cloud mask utilized when compiling level-3 CFC products. The intention is to shift entirely to a probabilistic formulation in the third edition of CLARA, planned for release in 2021.
The CTO retrieval in CLARA-A2 has been subject to several minor modifications while retaining the same principle methodology. However, the most significant improvement is related to an optimization of the iterative procedure, leading to a substantial increase in the fraction of resulting valid retrievals. The previous method in CLARA-A1 was not able to provide valid estimations for all semi-transparent clouds, where only approximately 70 % of all cloudy cases yielded valid CTO retrievals. The new PPS version used in CLARA-A2 processing provides CTO estimations for more than 97 % of all cases. This is especially important for the joint cloud-histogram product (JCH, see Sect. 3.3) and its ability to reflect true climatological conditions. The improvement has resulted from applying more physically sound constraints to the iterations (i.e. seeking the best physically reasonable solution instead of seeking the best solution and discarding it if it is not physically reasonable).
Regarding possible applications of the updated and extended CLARA-A2 data records of CFC and CTO products, it is clear that the extension of the CLARA-A2 data record with 6 additional years increases the chances of detecting possible changes in global cloudiness patterns (see also Section 5 where specific changes in the Arctic region are illustrated). Such changes on the global scale are indicated by many of the climate scenarios from climate models being used as input to the fifth assessment report (AR5) from the Intergovernmental Panel on Climate Change (IPCC) in 2014. The question of whether such changes can already be seen in observational data records has been addressed recently by Norris et al. (2016) Stephens (1978). Uncertainty estimates of the CPP products are also derived and provided.
Major updates compared to the CPP version applied for CLARA-A1 (Karlsson et al., 2013)   ation of improved cloud reflectance LUTs and the inclusion of observational sea ice (OSI SAF, 2016) and ERA-Interim reanalysis (Dee et al., 2011) snow-cover data to better characterize the surface albedo.It should be noted that, since CPP retrievals require reflectances from shortwave channels, CPP products, apart from CPH, are available exclusively during day-time (i.e.not during twilight and night).Since CPH is retrieved both during night and day, a complementary CPH (day) product is also provided.
Figure 7 shows the CPH, LWP and IWP products averaged over the 5-year period 2003-2007. Large-scale climatological characteristics of clouds are apparent, including the marine stratocumulus regions off the west coasts of the continents, the Inter-Tropical Convergence Zone (ITCZ), consisting mainly of ice clouds, and the mid-latitude cyclone tracks in both hemispheres. High cloud water-path values over polar regions should be largely attributed to inadequate retrievals over snow- and ice-covered surfaces, providing little contrast with clouds in the AVHRR visible channel.
Inter-comparison efforts with other similar data records show a general agreement better than 5 % for CPH (i.e. for absolute frequencies of water clouds) and 0.005 kg m⁻² for LWP and IWP, although the bias compared to DARDAR IWP is larger. More details on these results can be found in the VAL report.
Further illustrations of LWP and IWP results are given in Figs. 8 and 9. Figure 8a shows the monthly time series of LWP in the tropics from CLARA-A2, along with CLARA-A1 and two other satellite-based data records. Of these, PATMOS-x is the most similar to CLARA-A2, since it covers the same period and is based on the same (AVHRR) measurements. MODIS, on the other hand, covers the last 12 years of the time series and is the most stable, since it involves a single, well-calibrated instrument (here, MODIS on Aqua is used). In general, the LWP records agree well in terms of absolute amount of tropical LWP, except for a large bias in the case of CLARA-A1 (Fig. 8a). However, an improvement is apparent from CLARA-A1 to CLARA-A2, with the latter showing better agreement with PATMOS-x and MODIS. This difference between CLARA-A1 and CLARA-A2 is attributed mainly to changes in CPH, due to the implementation of a new retrieval algorithm. In terms of seasonal variability, all data records agree well, and differences between the two CLARA editions are minor (Fig. 8b). Both CLARA-A2 and PATMOS-x show some trends during various parts of the time series, which are primarily attributed to orbital drift.
It should also be noted that during the period January 2001-May 2003, channel 3a of AVHRR on-board NOAA-16 was switched on and used for the retrievals, instead of channel 3b, which was used throughout the rest of the time series. This switch causes a jump in the time series of both CLARA-A2 and PATMOS-x. Comparisons of LWP were also made against an independent, microwave-based data record (O'Dell et al., 2008), focusing on the main stratocumulus regions, where liquid clouds prevail (not shown). Results showed good agreement in both the seasonal cycle and absolute values of LWP, with an average bias of −0.0034 kg m⁻², fluctuating in the range ±0.01 kg m⁻². Furthermore, Fig. 9 shows a validation of pixel-level CLARA-A2 IWP with CloudSat-CALIOP-based DARDAR observations (Delanoë and Hogan, 2008). An overall underestimation by CLARA-A2 is observed, which becomes larger at high IWP values. Further analysis indicates that this disagreement is mainly caused by differences in REF (especially for thick clouds), while COT agrees well between the two data records (not shown).

Multi-parameter cloud product representations
The JCH product is a combined histogram of CTP and COT covering the solution space of both parameters (e.g. Rossow and Schiffer, 1991). This two-dimensional histogram gives the frequency of occurrences for specific COT and CTP combinations defined by a constant bin space, separable for liquid and ice clouds. This product is defined on a slightly coarser grid (1° × 1° resolution) in order to achieve higher statistical significance and to maintain manageable file sizes. The product is currently archived on the grid-point resolution, so user-defined JCH analysis regions can be created. Since the JCH product is a product visualization technique, its quality is dependent on the quality of the visualized products, including CTO (here, cloud top pressure), COT and CPH. Improvements to those products have already been described, but we repeat some of the important points here:
- The increase in the number of valid CTO results gives a better representation of the true CTP-COT distribution.
- The histograms are now based on cloud products defined in the level-2b representation mode, giving a more homogeneous and consistent data distribution.
- Frequencies of occurrences in each bin as well as the total cloud cover for all cases are now given (although the latter can still deviate slightly from the CFC product value since for JCH we require all three products, CTO, COT and CPH, to be simultaneously available).
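The construction of a joint CTP-COT histogram, separated by cloud phase, can be sketched as follows. The bin edges are illustrative ISCCP-like values chosen for this example and are not the bins of the CLARA-A2 JCH product; the function names and the pixel tuple layout are likewise assumptions. As in the product description, a pixel is counted only when CTP, COT and phase are all available.

```python
from bisect import bisect_right

# Illustrative bin edges (assumed, not the CLARA-A2 product definition).
COT_EDGES = [0.0, 0.3, 1.3, 3.6, 9.4, 23.0, 60.0, 100.0]
CTP_EDGES = [10.0, 180.0, 310.0, 440.0, 560.0, 680.0, 800.0, 1100.0]  # hPa


def bin_index(value, edges):
    """Index of the half-open bin [edges[i], edges[i+1]) holding value, or None."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def joint_histogram(pixels):
    """Count (ctp, cot, phase) pixels per CTP/COT bin, separately per phase.

    phase is 'liquid' or 'ice'; pixels lacking any of the three values
    (None) or falling outside the bin ranges are skipped.
    """
    n_ctp, n_cot = len(CTP_EDGES) - 1, len(COT_EDGES) - 1
    hist = {
        "liquid": [[0] * n_cot for _ in range(n_ctp)],
        "ice": [[0] * n_cot for _ in range(n_ctp)],
    }
    for ctp, cot, phase in pixels:
        if ctp is None or cot is None or phase not in hist:
            continue
        i, j = bin_index(ctp, CTP_EDGES), bin_index(cot, COT_EDGES)
        if i is None or j is None:
            continue
        hist[phase][i][j] += 1
    return hist


pixels = [(850.0, 5.0, "liquid"), (250.0, 0.5, "ice"),
          (250.0, 0.7, "ice"), (None, 2.0, "ice")]
jch = joint_histogram(pixels)
```

Because counts are stored per grid point, histograms from several grid points can be summed bin by bin to form a user-defined analysis region.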
Figure 10 shows global CLARA-A2 JCHs for afternoon satellites together with corresponding results from Aqua-MODIS Collection 6, PATMOS-x and CLARA-A1 over the period 2003-2014 (i.e. the Aqua-MODIS era). Notice, however, that there are no CLARA-A1 data after 2009, leading to a shorter period with data (2003-2009) for that data record in Fig. 10. In comparison to global JCH results for CLARA-A1 (Karlsson et al., 2013 and bottom panels in Fig. 10), we highlight that clouds are now more frequent at higher and lower tropospheric levels. This agrees well with MODIS and PATMOS-x, although the latter two have more boundary layer clouds present, especially over open water (Fig. 10d-i).
Over land, MODIS and PATMOS-x distributions show an increased frequency of mid- and high-level clouds, and a reduction in shallow cumulus and stratiform clouds (Fig. 10f, i). A relative increase in very optically thick mid- and upper-level clouds, representative of nimbostratus and deep convection, also emerges for MODIS and PATMOS-x. CLARA-A2 distributions generally agree with these distribution changes, although with CLARA-A2 there is a tendency to observe a higher frequency of optically thinner clouds (COT ranging 0.3-3.6) across the tropospheric column (Fig. 10c). Furthermore, there is a substantial amount of optically very thick mid- to upper-level clouds in CLARA-A2 and PATMOS-x (Fig. 10c, i), which are largely absent in MODIS (Fig. 10f). In CLARA-A2, this feature is linked to problems in estimating COT properly over snow-covered surfaces and therefore COT products over these surfaces should be treated with caution. A JCH where the Antarctic continent was masked resulted in the removal of this relative peak of high COT at mid- to high cloud levels in CLARA-A2 (not shown).

The surface albedo product
The cloud mask and AVHRR radiance data have been used as primary input data to generate the CLARA-A2 surface albedo (SAL) product of terrestrial black-sky surface albedo (wavelengths of 0.25-2.5 µm). It is available as pentad (five-day) and monthly means and has the same spatial resolution and projection as the other CLARA-A2 products. Examples of the CLARA-A2 SAL product for January and July 2012 are given in Fig. 11.
The retrieval algorithm of CLARA-A2 SAL follows the same outline as the previous CLARA-A1 SAL described in detail in Riihelä et al. (2013): after cloud masking, the possible effect of topography on geolocation and radiometry in locations with inclined slope is corrected. Then, for the pixels on land, a correction for scattering and absorption effects of aerosols and other atmospheric constituents is performed. In CLARA-A2 SAL, a dynamic aerosol optical depth (AOD) time series has been used as input for the atmospheric correction. It has been composed using the total ozone mapping spectrometer (TOMS) and ozone monitoring instrument (OMI) aerosol index data (Jääskeläinen et al., 2017). In CLARA-A1 SAL, a constant AOD value of 0.1 was used as input for the atmospheric correction, but with the new AOD time series a more realistic temporal variation of atmospheric corrections is achieved. A correction for reflectance anisotropy of vegetated surfaces and spectral albedo is then calculated. In CLARA-A1, one land use classification (LUC) was used for the whole time series. For the current SAL product, four different LUCs are used. Finally, a narrowband-to-broadband conversion is made to derive the albedo over the full spectral range of the product (0.25-2.5 µm). Since the reflectance anisotropy of snow is large and varies according to snow type (Peltoniemi et al., 2005), the albedo of snow- and ice-covered areas is derived by averaging the broadband bidirectional reflectances of the AVHRR overpasses into pentad or monthly means. These overpasses are found to cover the whole viewing hemisphere (SZA smaller than 70° and satellite zenith angles smaller than 60°) in most of the cases, giving a good representation of the bidirectional reflectance distribution function. For the observations over open water, the albedo is constructed as a function of SZA and wind speed. Wind information is taken from microwave measurements (SMMR and SSM/I data) and available SYNOP observations. The classification between open water and sea ice has been verified using the Ocean and Sea Ice Satellite Application Facility (OSI SAF) sea ice extent data (Eastwood, 2014).
In summary, the main differences in the algorithm between CLARA-A1 SAL and CLARA-A2 SAL are as follows: the atmospheric correction uses a dynamic AOD time series, the number of LUCs used has been increased from one to four, and wind speed data are used over the sea to describe the sea surface roughness.
The data record has been validated against in situ albedo observations from the Baseline Surface Radiation Network (Ohmura et al., 1998), the Greenland Climate Network (Steffen et al., 1996) and the TARA floating ice camp (Gascard et al., 2008). The sites have been chosen according to data availability, temporal coverage of measurements and quality of data. The validation results show that CLARA-A2 SAL has a relative accuracy of 10-20 % over vegetated sites, and typically 3-15 % over snow and ice. Larger differences between the in situ measurements and the satellite-based albedo value are mostly related to the heterogeneity of high-resolution near-infrared surface reflectances on CLARA-A2 SAL pixel scales. The spatial representativeness is an issue at most of the sites and should always be considered when using measurements of different scales (and locations) for validation (Riihelä et al., 2012).
The SAL time series was also compared to MCD43C3, the surface albedo product from MODIS (Schaaf et al., 2002). The comparison showed that on a global scale, the two products are in good agreement. An overview of the MODIS comparison results for both CLARA-A2 and CLARA-A1 SAL can be seen in Fig. 12, showing the mean black-sky albedo. These data have been averaged over the common retrievable land and/or snow area after coarsening the MODIS product to 0.25° spatial resolution and averaging the CLARA SAL pentad means to fit the MODIS products (delivered as 16-day means). Water areas are excluded from the analysis since the MODIS product is not defined for water bodies (including sea ice areas). The CLARA-A2 and MCD43C3 products are in good agreement and generally the albedo differences are less than 5 %, especially during the latter half of 2009. The remaining difference is caused mainly by methodological differences: the MODIS albedo product is normalized to local noon, which, for surfaces other than snow, produces the minimum daily albedo. Taking this into consideration, CLARA-A2 SAL values are expected to be slightly higher than the MODIS product values. An analysis of the differences on latitudinal bands (not shown) shows that over the Northern Hemisphere, the largest differences appear over Arctic land areas. The topography (which is corrected for in SAL but not in the MODIS product) also creates differences in the average albedo over mountainous regions. The albedo values of CLARA-A2 SAL are considerably closer to MODIS albedo values than CLARA-A1 SAL values are. This has been achieved by using dynamic AOD in the atmospheric correction of CLARA-A2 SAL.
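The common-area averaging used in this comparison can be illustrated with a minimal sketch. It assumes that both products are already on the same 0.25° grid and the same time step, that missing retrievals are marked with NaN, and that the relative difference is taken with respect to the MODIS mean; the function names and the sample values are hypothetical and do not come from the CLARA-A2 processing chain.

```python
import math

def common_area_mean(field_a, field_b):
    """Unweighted means of two gridded fields over cells valid in both.

    Cells where either product is NaN (assumed missing-data marker)
    are skipped, so both means refer to the same set of grid cells.
    """
    pairs = [
        (a, b)
        for row_a, row_b in zip(field_a, field_b)
        for a, b in zip(row_a, row_b)
        if not (math.isnan(a) or math.isnan(b))
    ]
    if not pairs:
        raise ValueError("no grid cells are valid in both products")
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    return mean_a, mean_b

def relative_difference_percent(value, reference):
    """Difference of value from reference, in per cent of the reference."""
    return 100.0 * (value - reference) / reference

nan = float("nan")
# Hypothetical black-sky albedo fields in per cent (2 x 3 grid cells).
clara = [[30.0, 20.0, nan], [40.0, 10.0, 50.0]]
modis = [[28.0, nan, 15.0], [40.0, 10.0, 50.0]]

clara_mean, modis_mean = common_area_mean(clara, modis)
rel_diff = relative_difference_percent(clara_mean, modis_mean)
```

No area or irradiance weighting is applied in this sketch, in line with the description of Fig. 12.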
Table 3. Validation results of the CLARA-A2 surface solar irradiance (SSI) data record (monthly mean/daily mean) against the global data from the BSRN network; for reference, the corresponding values for CLARA-A1 are also given. Shown are the number of months/days, the bias and the absolute bias as well as the correlation of the anomaly between the two CLARA data records and the BSRN data.
The temporal stability of the CLARA-A2 SAL time series has also been evaluated using the central part of the Greenland ice sheet (not shown) as a site whose albedo is expected to remain fairly constant over a long period (Riihelä et al., 2013). The results showed that the maximum deviation of monthly-mean CLARA-A2 SAL over this site from its 34-year mean was 8.5 %, including some natural variability associated with, for example, varying SZA. Also, the 34-year mean albedo for this site was estimated to be 0.786, which is somewhat lower than the literature values for the albedo of dry fresh snow (0.85, Konzelmann and Ohmura, 1995).
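As an illustration of the stability measure discussed for the Greenland site, the following sketch computes the long-term mean of a series of monthly means and the largest deviation from it. It assumes that the deviation is expressed relative to the long-term mean; the text does not state whether the quoted percentage is relative or absolute, and the sample values are hypothetical.

```python
def max_relative_deviation_percent(monthly_means):
    """Return (long-term mean, largest deviation in per cent of that mean)."""
    if not monthly_means:
        raise ValueError("empty time series")
    long_term_mean = sum(monthly_means) / len(monthly_means)
    max_dev = max(abs(v - long_term_mean) for v in monthly_means)
    return long_term_mean, 100.0 * max_dev / long_term_mean

# Hypothetical monthly-mean albedo values (unitless fraction).
albedo = [0.80, 0.78, 0.76, 0.82, 0.79]
mean_albedo, max_dev_percent = max_relative_deviation_percent(albedo)
```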
The cloud mask used in CLARA-A2 SAL is less conservative than the one used in CLARA-A1 SAL. This is likely to affect the SAL values, especially during non-continuous cloud conditions. Also, inaccuracies in the land cover data record used by the CLARA-A2 SAL algorithms may cause retrieval errors. Users are advised to use the available support data (for example, number of observations, standard deviation, mean SZA, skewness and kurtosis per pixel) to remove suspect retrievals from their analysis.
Our quality assessment of the CLARA-A1 SAL surface albedo data record has shown that SAL retrievals over snow and ice, particularly over the Arctic, are of good quality (Riihelä et al., 2010). Also, according to user feedback, the data record has been useful for climate model validation (e.g. Light et al., 2015). The retrieval over snow and ice is essentially the same in CLARA-A2 SAL as it was for the previous edition of the data record, which gives reason to believe that the user feedback and quality assessment should, to some extent, also be valid for CLARA-A2. The validation results against in situ observations and comparison with MODIS MCD43C3 product show that adding the new AOD time series for land areas has improved the algorithm performance elsewhere as well.
To further illustrate the SAL and cloud products and their possible applications, we will consider the following question: how have surface and cloud conditions changed in the Arctic region over the last 3 to 4 decades? Similar studies on Arctic surface albedo variations alone have already been made based on CLARA-A1 data (Riihelä et al., 2013), but access to a longer time series of observations (including the new record year, 2012, in Arctic minimum sea ice extent) and the coupling to cloud processes clearly motivate continued studies in this field. Many climate predictions and scenarios point towards the existence of an Arctic amplification (e.g. Cohen et al., 2014) of the regional temperature rise due to several large positive feedback effects; two of these effects are the decrease of sea ice cover and its interaction with cloudiness. The good AVHRR observation conditions during the polar summer season (e.g. as pointed out in Sect. 3.1 discussing Fig. 5) now permit more in-depth studies of these two aspects.
Figure 13 shows the mean change of SAL for the first decade of the CLARA-A2 period (1982-1991) compared to the last decade (2006-2015) over the high-latitude Northern Hemisphere summer months. The corresponding changes in mean cloud cover are shown in Fig. 14. We can clearly see the strong SAL signal associated with Arctic sea ice decline since the 1980s, which is evident in all months from May through September. Corresponding changes in Arctic cloudiness (Fig. 14) are, however, not as systematic or well depicted. This is not surprising since cloud conditions depend primarily on atmospheric circulation patterns. However, we notice a tendency for increased cloud cover over the marginal ice zone and the new ice-free regions of the inner Arctic in the months July-September, while some decreases in cloud cover can be seen over the remaining ice-covered parts (e.g. close to Greenland and the Canadian Archipelago). This result, based on long-term CLARA-A2 data, supports the findings by Devasthale et al. (2016a) regarding a similar co-variability between cloudiness and sea-ice concentration observed in the last decade. An interesting feature in April-May is also the increase in cloud cover in the inner Arctic region, while cloudiness appears to decrease outside of this area. However, further studies are needed to investigate the significance of these patterns and the possible links to changes in circulation and radiation conditions. For these purposes, the entire CLARA-A2 data record (i.e. also including the years 1992-2005) must be used.
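The decade-to-decade comparison can be sketched for a single grid cell and calendar month as follows. The sketch assumes that the change is defined as the last-decade mean minus the first-decade mean; the function names and the synthetic series are hypothetical.

```python
def period_mean(values_by_year, first_year, last_year):
    """Mean of the values whose year lies in [first_year, last_year]."""
    selected = [v for y, v in values_by_year.items()
                if first_year <= y <= last_year]
    if not selected:
        raise ValueError("no data in the requested period")
    return sum(selected) / len(selected)

def decadal_change(values_by_year, early=(1982, 1991), late=(2006, 2015)):
    """Late-period mean minus early-period mean (sign is an assumption)."""
    return period_mean(values_by_year, *late) - period_mean(values_by_year, *early)

# Synthetic July albedo (per cent) for one grid cell, declining 0.5 per year.
july_albedo = {year: 60.0 - 0.5 * (year - 1982) for year in range(1982, 2016)}
change = decadal_change(july_albedo)
```

A negative value indicates a lower mean albedo in the last decade than in the first.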

Surface radiation products
The retrieval algorithms to derive the CLARA-A2 surface radiation products have only undergone minor changes since CLARA-A1. Details on the algorithms are given in Karlsson et al. (2013). Thus, this section presents a few validation results of the CLARA-A2 surface radiation data records.

Surface solar irradiance
The spatial data coverage of the surface solar irradiance (SSI) data has been substantially improved. In CLARA-A2, only snow-covered surfaces are excluded due to a reduced accuracy of the SSI data under these conditions. The validation against surface reference measurements from the Baseline Surface Radiation Network (BSRN) documents the improved accuracy of the CLARA-A2 surface irradiance data record, mainly due to the improved cloud detection (see Table 3).
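Table 3 reports the bias, the absolute bias and the correlation of the anomalies. A minimal sketch of how such metrics might be computed for one station is given below. It assumes that the absolute bias is the mean absolute difference and that anomalies are deviations from the mean of each calendar month; the function names and the sample values are hypothetical.

```python
import math
from collections import defaultdict

def bias(satellite, reference):
    """Mean difference, satellite minus reference."""
    return sum(s - r for s, r in zip(satellite, reference)) / len(satellite)

def mean_absolute_bias(satellite, reference):
    """Mean of the absolute differences."""
    return sum(abs(s - r) for s, r in zip(satellite, reference)) / len(satellite)

def monthly_anomalies(values, months):
    """Subtract the mean of each calendar month from its values."""
    groups = defaultdict(list)
    for v, m in zip(values, months):
        groups[m].append(v)
    climatology = {m: sum(vs) / len(vs) for m, vs in groups.items()}
    return [v - climatology[m] for v, m in zip(values, months)]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly-mean irradiance (W m-2): two Januaries, two Julys.
months = [1, 7, 1, 7]
station = [50.0, 250.0, 60.0, 240.0]
clara = [55.0, 248.0, 63.0, 246.0]

b = bias(clara, station)
mab = mean_absolute_bias(clara, station)
r_anom = pearson(monthly_anomalies(clara, months),
                 monthly_anomalies(station, months))
```

Removing the seasonal cycle before correlating prevents the large annual amplitude of the irradiance from dominating the correlation.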
Figure 15 presents the comparison of the decadal linear trends derived from the CLARA-A2 SSI data record with the corresponding trends derived from measurements obtained from the BSRN. To assess the validity of the linear trend derived from the CLARA data record, only surface stations with continuous observations covering at least 10 years of measurements are used.
The trends derived from the CLARA-A2 surface irradiance record correspond well to the trends derived from the BSRN measurements, indicating the high stability of the satellite-derived product and its suitability to calculate temporal changes and trends (Fig. 15). For most BSRN stations, the decadal trend is positive during the considered time period. Note that the time period for which BSRN measurements are available differs between the stations; consistent time periods were used to compare the CLARA-A2 SSI data record with the BSRN measurements at each station.
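A decadal linear trend of the kind compared here can be sketched as an ordinary least-squares slope scaled to 10 years. The fitting method is an assumption, since the text does not name it, and the series below is synthetic.

```python
def decadal_trend(years, values):
    """Least-squares slope of values against time, expressed per decade."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more matching (year, value) pairs")
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    return 10.0 * sxy / sxx

# Synthetic annual-mean irradiance (W m-2) rising by 0.3 W m-2 per year.
years = list(range(1992, 2016))
ssi = [180.0 + 0.3 * (y - 1992) for y in years]
trend = decadal_trend(years, ssi)
```

For a station comparison, the same function would be applied to the satellite and the station series over an identical time period, as the text requires.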
Figure 16 presents the spatial distribution of the decadal linear trend between 1992 and 2015 based on the CLARA-A2 SSI data record over Europe and a portion of Northern America. To limit the impact of the missing data during the first decade of the CLARA-A2 SSI data record due to the availability of only one AVHRR instrument, the trend was derived starting in 1992, when at least two AVHRR instruments were available. In both regions there is an overall positive trend in surface irradiance, consistent with surface observations (e.g. Wild, 2012).

Surface longwave radiation
The CM SAF CLARA-A2 data record provides information on the surface longwave downwelling (SDL) and outgoing (SOL) radiation in order to enable studies of the full surface radiation budget. Both data records are dependent upon the surface longwave radiation records from the ERA-Interim reanalysis (Dee et al., 2011); using topographic information and the monthly mean cloud fraction from CLARA-A2, the ERA-Interim data are downscaled to match the spatial resolution of the CLARA-A2 data record. For SOL, this means a pure downscaling of ERA-Interim data. For SDL, an effective cloud factor is derived, based on ERA-Interim differences in clear-sky and all-sky downwelling longwave radiation, as well as reanalysis cloud fraction (Karlsson et al., 2013). This factor is then downscaled to CLARA-A2 resolution and multiplied by the CLARA-A2 satellite-derived cloud fraction. The result is a hybrid estimate of combined satellite-reanalysis SDL.
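One possible reading of the SDL construction is sketched below for a single grid cell. It is an assumption, not the documented formulation (which is given in Karlsson et al., 2013): the effective cloud factor is taken as the all-sky minus clear-sky difference per unit reanalysis cloud fraction, and the hybrid SDL as the clear-sky value plus that factor times the satellite cloud fraction. The downscaling step is omitted, and all names and numbers are hypothetical.

```python
def effective_cloud_factor(sdl_all_sky, sdl_clear_sky, reanalysis_cf, min_cf=0.01):
    """Cloud longwave effect per unit cloud fraction (W m-2).

    Returns 0 for nearly cloud-free reanalysis cells to avoid dividing
    by a vanishing cloud fraction (threshold is an assumption).
    """
    if reanalysis_cf < min_cf:
        return 0.0
    return (sdl_all_sky - sdl_clear_sky) / reanalysis_cf

def hybrid_sdl(sdl_all_sky, sdl_clear_sky, reanalysis_cf, satellite_cf):
    """Clear-sky SDL plus the cloud effect scaled by the satellite cloud fraction."""
    factor = effective_cloud_factor(sdl_all_sky, sdl_clear_sky, reanalysis_cf)
    return sdl_clear_sky + factor * satellite_cf

# Hypothetical monthly means: reanalysis SDL in W m-2, cloud fractions 0..1.
sdl_more_cloud = hybrid_sdl(330.0, 300.0, reanalysis_cf=0.6, satellite_cf=0.8)
sdl_same_cloud = hybrid_sdl(330.0, 300.0, reanalysis_cf=0.6, satellite_cf=0.6)
```

Under this reading, the hybrid estimate reduces to the reanalysis all-sky value wherever the satellite and reanalysis cloud fractions agree, and it departs from it in proportion to their difference.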
Table 4 shows the validation results of the monthly-mean CLARA-A2 SOL and SDL data records compared to measurements obtained from the BSRN network. The improved cloud mask in CLARA-A2 led to a substantial improvement in the quality of the SDL data record relative to the CLARA-A1 data record.

Summary and future plans
We have described the CLARA-A2 dataset, an improved 34-year cloud, surface albedo and radiation budget data record based on data from the AVHRR sensor on polar orbiting operational meteorological satellites. Major improvements in both the underlying AVHRR radiances and in the retrieval schemes have been described, together with some validation results. Regarding the latter, we have presented only a limited selection of the results from the extensive validation efforts that have been conducted. More results and analyses are planned in follow-on papers. Some typical applications have also been demonstrated to encourage such studies using CLARA-A2 data records. We would also like to highlight the broadening of the CLARA portfolio of products, which now also includes daily aggregated and resampled orbits (level 2b) and the existence of an experimental data record on probabilistic cloud masks. Related to this is the development of a CLARA-A2 cloud dataset COSP simulator (Bodas-Salcedo et al., 2011). This simulator will take into account artefacts in the satellite observations and make adequate corrections for viewing and observation conditions to give a more realistic inter-comparison of results between climate models and CLARA-A2 cloud products.
A continuation of this work has recently been secured by the EUMETSAT approval of the third Continuous Development and Operations Phase (CDOP-3) of the CM SAF project covering the years 2017-2022. This means that a third edition of CLARA (CLARA-A3) is planned for release by the end of the CDOP-3 phase. This would be the last edition based entirely on original AVHRR data, including data from METOP-C (the last polar satellite carrying the AVHRR instrument). Furthermore, it will include an extension of the dataset with data forward in time for the years 2016-2020 and backward in time to 1978 (including data from the AVHRR-1 sensor, starting with the Tiros-N satellite), which means it will cover more than 40 years. The product dataset will then also be extended with top-of-atmosphere radiation products and the original AVHRR radiances (level 1) will take advantage of a revised infrared calibration (following Mittaz and Harris, 2009), in addition to the upgraded visible calibration.

Figure 1. Day-time equator observation times for all satellites covered by CLARA-A2 from NOAA-7 to NOAA-19 and METOP A and B. The figure shows ascending (northbound) equator crossing times for all afternoon satellites from NOAA-7 to NOAA-19 and descending (southbound) equator crossing times for all morning satellites (NOAA-12, NOAA-15, NOAA-17 and METOP A+B). Corresponding nighttime or evening observations take place 12 h earlier or later. Some data gaps are present but only for a number of isolated dates.

Figure 2. Mean difference in cloud fraction between CLARA-A2 and CLARA-A1 for the common period 1982-2009 over the African continent.

Figure 3. (a) Time series of mean monthly and annual cloud fraction for CLARA-A2 (blue), CLARA-A1 (red) and SYNOP (black), (b) bias-corrected RMSE and (c) bias for the entire period 1982-2015. See text for further details.

Figure 4. Overall global frequency of correct cloudy and cloud-free estimations (often referred to as the hit rate) derived from nearly 10 000 collocated (within 3 min) near-nadir AVHRR and CALIPSO-CALIOP orbits in the period 2006-2015. The hit rate was calculated after discarding CALIOP-detected clouds with cloud optical thicknesses below 0.15. Results are collected in a Fibonacci grid with 28 878 grid points evenly spread out around the Earth approximately 150 km apart. The resulting grid has almost equal area and almost equal shape of all grid cells. White spots are cells with insufficient coverage of collocations.
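To illustrate the two ingredients of this figure, the sketch below builds a Fibonacci lattice on the sphere and computes a hit rate from paired cloud flags. The lattice uses a common golden-angle construction, which is an assumption because the caption does not specify the exact formulation; the filtering of optically thin CALIOP clouds is assumed to have been applied before the flags are passed in. All names are hypothetical.

```python
import math

def fibonacci_grid(n_points):
    """Return (lat, lon) pairs in degrees for a Fibonacci lattice on a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n_points):
        # Equal steps in sin(latitude) give nearly equal-area cells.
        z = 1.0 - (2.0 * i + 1.0) / n_points
        lat = math.degrees(math.asin(z))
        lon = math.degrees((i * golden_angle) % (2.0 * math.pi)) - 180.0
        points.append((lat, lon))
    return points

def hit_rate(avhrr_cloudy, caliop_cloudy):
    """Fraction of collocations where both cloud flags agree."""
    if not avhrr_cloudy or len(avhrr_cloudy) != len(caliop_cloudy):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(a == c for a, c in zip(avhrr_cloudy, caliop_cloudy))
    return hits / len(avhrr_cloudy)

grid = fibonacci_grid(28878)
# Side length of a square with the mean area per grid point (Earth radius 6371 km).
mean_cell_side_km = math.sqrt(4.0 * math.pi * 6371.0 ** 2 / len(grid))
example_rate = hit_rate([True, True, False, False], [True, False, False, False])
```

In practice each collocation would be assigned to its nearest grid point and the hit rate evaluated per cell, with cells holding too few collocations left blank.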

Figure 5. Probability of detecting cloudy conditions over the Arctic region during the polar winter (a) and during the polar summer (b). Results were derived from the same dataset as in Fig. 3.
include the implementation of the new cloud-phase algorithm in the NWC SAF PPS software package (first made in PPS version 2012 and for the latest improvements in PPS version 2014), the gener-

Figure 6. Intercomparison of CLARA-A2 and MODIS Collection 6 (Aqua part) cloud fraction over the covered MODIS period, 2002-2014. (a) CLARA-A2 global cloud cover (CFC). (b) MODIS global cloud cover. (c) Scatter plot of the two data records. (d) Latitudinal distribution (zonal means) of cloud cover from the two data records (CLARA-A2 in red and MODIS in blue).

Figure 7. (a) Fraction of liquid clouds relative to total cloud fraction, (b) all-sky liquid water path and (c) all-sky ice water path, averaged for the 5-year period 2003-2007. All data come from CLARA-A2 level-3 products, derived from afternoon (NOAA-16 and NOAA-18) satellite measurements. Water paths are measured in kilograms per square metre (kg m⁻²).

Figure 9. CLARA-A2 (NOAA-18) IWP vs. DARDAR IWP (kg m⁻²) for the months January and July 2008. The yellow line depicts the median and orange lines the 16th and 84th percentiles of the CLARA-A2 distribution at the corresponding DARDAR IWP. The greyscales indicate regions enclosing the 10, 20, 40, 60 and 75 % of points with the highest occurrence frequency.

Figure 10. Global JCH relative frequency distributions (colours, %) of CTP (hPa) and COT for all months for four data records: CLARA-A2 (a-c), MODIS Collection 6 (d-f), PATMOS-X (g-i) and CLARA-A1 (j-l). The covered period is 2003-2013, except for CLARA-A1 which only covers the period 2003-2009 (no data after 2009). Left column contains the JCHs over sea and land surfaces (sea + land), middle column over sea-only surfaces (sea) and right column over land-only surfaces (land). Histogram frequencies are normalized to unity, such that each histogram sums to 100 %.

Figure 11. Global monthly mean surface albedo for July 2012 (a). Corresponding plots for two polar grids are shown at the bottom of the figure; one for the Arctic region (b) and one for the Antarctic region (c, but observe that the month here is January instead of July). Regions without values are grey-shaded (here resulting from dark conditions prevailing close to Antarctica during the polar winter). All albedos are given as a percentage (%).

Figure 12. Comparison of surface albedos from CLARA-A2 SAL (blue line) and CLARA-A1 SAL (black line) pentad composites with MODIS MCD43C3 (red line) results for 2009 (unit is per cent). The means are calculated only over those land and/or snow surfaces that are retrieved in both products; the MODIS product is not defined for water bodies, thus they are excluded from this analysis. No weighting for irradiance or area has been applied. The relative differences between the CLARA products and MCD43C3 are shown with a grey (CLARA-A2) and green (CLARA-A1) dashed line.

Figure 16. Decadal linear trend (W m⁻² decade⁻¹) of the surface irradiance from 1992 to 2015 based on the CLARA-A2 SSI data record in (a) central Europe and (b) parts of Northern America.
The level-2b data representation (introduced by Heidinger et al., 2014) is motivated by the inhomogeneous global coverage of polar sun-synchronous satellite data. Each polar satellite offers approximately 14 evenly distributed observations per day for each location near the poles, while at the equator, each location is observed only twice, approximately 12 h apart. The purpose of the level-2b data representation is to form a more homogeneous data record, having only two observations at the most nadir-viewing angle per day per satellite for each location globally. The alternative of using all available observations for level-3 products (as was done for CLARA-A1) results in a skewed distribution of the observations because of the inhomogeneous observation frequency (increasing with latitude). By selecting only the observations which are made closest to the nadir condition, we ensure that observations are made at almost the same viewing conditions and, most importantly, observations are made at nearly the same local time globally for each level-2b product.
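The selection of the most nadir-viewing observations can be sketched as follows. The sketch assumes that the two daily observations per satellite are distinguished by orbit node (ascending or descending) and that each observation carries its satellite zenith angle; the record layout and field names are hypothetical and are not taken from the CLARA-A2 level-2b format.

```python
def select_level2b(observations):
    """Keep, for each (satellite, date, node, cell), the observation
    with the smallest satellite zenith angle (closest to nadir)."""
    best = {}
    for obs in observations:
        key = (obs["satellite"], obs["date"], obs["node"], obs["cell"])
        current = best.get(key)
        if current is None or obs["sat_zenith_deg"] < current["sat_zenith_deg"]:
            best[key] = obs
    return list(best.values())

def make_obs(node, zenith):
    """Build one hypothetical observation of a high-latitude grid cell."""
    return {"satellite": "NOAA-18", "date": "2008-07-01", "node": node,
            "cell": (850, 120), "sat_zenith_deg": zenith}

# Several overpasses of the same cell on one day, as happens near the poles.
overpasses = [make_obs("ascending", 55.0), make_obs("ascending", 12.0),
              make_obs("ascending", 40.0), make_obs("descending", 30.0)]
selected = select_level2b(overpasses)
selected_zeniths = sorted(o["sat_zenith_deg"] for o in selected)
```

The result holds at most two observations per cell, satellite and day, regardless of how many overpasses the cell receives.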
[...] based on CLARA-A2 data are encouraged here; note that 7 more years of data are available in CLARA-A2 compared to what was used by Norris et al. (2016).
[...] cloud thermodynamic phase (CPH), cloud optical thickness (COT), particle effective radius (REF) and liquid water path or ice water path (LWP or IWP). Since 2012, the CPP package has been included in the NWC SAF PPS cloud processing package. CPH is determined from a cloud-typing approach following Pavolonis et al. (2005). This cloud-type algorithm consists of a series of spectral tests applied to infrared brightness temperatures. It has a nighttime branch, as well as a day-time branch in which shortwave reflectances are also considered. COT and REF are retrieved using the classical Nakajima and King (1990) approach, which is based on the principle that cloud reflectance is mainly dependent on COT at a non-absorbing, visible wavelength and on REF at an absorbing, near-infrared wavelength. In the CPP algorithm (Stengel et al.,