Decadal-scale relationship between measurements of aerosols, land-use change, and fire over Southeast Asia

Abstract. A simultaneous analysis of 13 years of remotely sensed data of land cover, fires, precipitation, and aerosols from the MODIS, TRMM, and MISR satellites and the AERONET network over Southeast Asia is performed, leading to a set of robust relationships between land-use change and fire being found on inter-annual and intra-annual scales over Southeast Asia, reflecting the heavy amounts of anthropogenic influence over land-use change and fires in this region of the world. First, we find that fires occur annually, but with a considerable amount of variance in their onset, duration, and intensity from year to year, and from two separate regions within Southeast Asia. Second, we show that a simple regression model of the land-cover, fire, and precipitation data can be used to recreate a robust representation of the timing and magnitude of measured aerosol optical depth (AOD) from multiple measurements sources of this region using either 8-day (better for onset and duration) or monthly (better for magnitude) measurements, but not daily measurements. We find that the reconstructed AOD matches the timing and intensity from AERONET measurements to within 70 to 90 % and the timing and intensity of MISR measurements to within 50 to 95 %. This is a unique finding in this part of the world since cloud-covered regions are large, yet the model is still robustly capable, including over regions where no fires are observed and hence no emissions would be expected to contribute to AOD. Third, we determine that while Southeast Asia is a source region of such intense smoke emissions, portions of it are also impacted by smoke transported from other regions. There are regions in northern Southeast Asia which have two annual AOD peaks, one during the local fire season and the other, smaller peak corresponding to a combination of some local smoke sources as well as transport of aerosols from fires in southern Southeast Asia and possibly even from anthropogenic sources in South Asia. Overall, this study highlights the importance of taking into account a simultaneous use of land-use, fire, and precipitation for understanding the impacts of fires on the atmospheric loading and distribution of aerosols in Southeast Asia over both space and time. Furthermore, it highlights that there are significant advantages of using 8-day and monthly average values (instead of daily data) in order to better quantify the magnitude and timing of Southeast Asia fires.


Introduction
Southeast Asia has been experiencing major haze events over the past three to 5 decades due to a combination of increased urbanization (Cohen and Wang, 2014;Cohen and Prinn, 2011) and large-scale conversion of forests by fire (Cohen, 2014;van der Werf et al., 2008;Taylor, 2010;Dennis, 2005).The underlying connections and mechanisms relating the sources and strength of fire-based emissions and observed intra-annual, inter-annual, and inter-decadal variations of fire events, with meteorology, land-use change, and anthropogenic driving factors, are not well understood (van der Werf et al., 2006;Giglio et al., 2006;Hansen et al., 2008;Field et al., 2009).Moreover, recent studies have shown that the impacts these events have on the atmospheric loading of aerosols and the larger climate are becoming greater in both absolute terms and frequency (Langmann et al., 2009;Nakajima et al., 1999;Podgorny et al., 2003;Rosenfeld, 1999).Some of the heaviest events, which previously in the literature were only associated only with strong El Niño in-Published by Copernicus Publications on behalf of the European Geosciences Union.
duced drying events, are now being found to occur in connection with other, less extreme impacts on precipitation and even surface moisture, occurring at various scales including but not limited to the Indian Ocean Dipole (IOD), the Madden-Julian Oscillation (MJO), the shifting of the Intertropical Conversion Zone, mountain-induced waves, the land-sea breeze, and localized convection (Fuller and Murphy, 2006;Wooster et al., 2012;Natalia Hasler and Avissar, 2009;Reid et al., 2013).The fact that so many factors are capable of influencing these large-scale events is likely to make prediction much more challenging, as is seen by the fact that since 2000, there were extreme events of varying intensity, length, and duration occurring in 2002, 2004, 2006, 2009, 2013, and 2014 in southern Southeast Asia, the region covering Indonesia, Malaysia, Singapore, and Brunei and every year except 2003 in northern Southeast Asia, the region covering Thailand, Myanmar, Cambodia, Vietnam, and Laos (Neale and Slingo, 2003;Chang et al., 2005;Aldrian et al., 2007;Cohen, 2014;Wooster et al., 2012).To date, other than Cohen (2014), there have been no other studies that have looked at Southeast Asian fires both robustly and holistically to the extent of being able to reproduce both the extreme and low levels of aerosols at both the monthly and the decadal scale.Furthermore, other than Cohen (2014) and Cohen and Wang (2014) there are no works that have been able to satisfactorily estimate the emissions of aerosols over this region of the world, from the fundamentals and over the entire time period, without scaling or other statistical enhancement techniques to match with atmospheric column measurements such as aerosol optical depth (AOD) and absorbing aerosol optical depth (AAOD).
Knowledge of the spatial and temporal distribution and the magnitude of the emissions and atmospheric loadings is essential for our improved understanding of the environmental impacts of the fires.Emissions of aerosols and gases from these fires include significant sources of black carbon (BC), organic carbon (OC), and ozone and therefore contribute greatly towards impacting human health (Afroz et al., 2003), atmospheric radiative forcing (Wang, 2007;Jacobson, 2001;Ming et al., 2010;Ramanathan and Carmichael, 2008;Cohen et al., 2011), and cloud and precipitation properties (Huang et al., 2006;Tao et al., 2012;Wang, 2013).Furthermore, given the general circulation of the Earth and the lack of precipitation during the dry season in the tropics, coupled with intense localized convection, a large portion of the emitted pollutants will spread widely in space and time, entering into the global-scale circulation patterns (Wang, 2007).Therefore, emissions from these regions during these times of the year may have a significant impact on people and the environment thousands of kilometers away from their source.
AOD can be used to quantify the emissions from the fires, since it is the non-dimensional vertical integral of the atmospheric extinction (the sum of scattering and absorbance) of solar radiation due to aerosols.AOD is useful since it can be measured by a combination of land-based and space-borne instruments (Holben et al., 1998;Petrenko et al., 2012;Dubovik et al., 2000).The extinction is a function of the vertical aerosol mass and size distributions as well as chemical, physical, and optical properties.These values in turn are a function of the emissions and gasses from fires and other various anthropogenic sources, in situ processing, washout from precipitation, and atmospheric transport.Hence, the emissions of primary BC and OC from these fires, coupled with other secondary species, has a functional relationship with the change in the AOD, which otherwise would not have occurred over these fire regions and downwind, at these specific times, if the fires were not present.
This paper uses these relationships and goes one step further to make the link between measurements of landuse change and fires directly with the atmospheric column measurements, with fires the intermediary step between the two.This is because rapid conversion of forests, agricultural lands, and associated waste products by burning is one of the primary sources of aerosols throughout Southeast Asia (Langmann et al., 2009;Miettinen et al., 2013).However, little is known about the exact spatial and temporal distribution of these fires (Fu et al., 2012;Chen et al., 2016;Zhou et al., 2016).Furthermore, the inter-annual and intra-annual variability of biomass burning and its associated underlying mechanisms are also not well understood or constrained by measurements, leading to the current poor understanding of fires impact on the local and global aerosol climatology (van der Werf et al., 2006).Furthermore, Southeast Asia is often covered with clouds, which further complicates detecting both fire and the pollution that comes from it (Miettinen et al., 2013;Giglio et al., 2006;Remer et al., 2013).A few studies have looked at this and give estimates that the emissions are underestimated, up to a factor of 4 times (Giglio et al., 2003(Giglio et al., , 2006;;Petrenko et al., 2012;Cohen, 2014;Cohen and Wang, 2014).
Given that large-scale fires lead to abrupt and definitive changes in the vegetative properties, we employ a set of measures of land-surface properties which have a long time record, such as LAI (leaf area index), NDVI (normalized difference vegetation index), and the number of 1 km by 1 km pixels with a measured fire (fire count).While we know that some changes may be masked, obscured, or otherwise missing, any observed abrupt changes in these variables or the land's properties itself must be linked at a minimum with any observed changes in the AOD itself.Moreover, since the onset and the offset on the Asian Monsoon control the start and end of the fire seasons by rapidly changing from relatively dry to intensely wet and vice versa (Hansen et al., 2008), large-scale changes in the monthly scale precipitation is a proxy for the ability of the fires to occur, as well as washout of aerosols.Therefore, precipitation is also intimately linked with measured AOD over Southeast Asia.This is even more important given that there are only very few studies that have been able to quantify emissions over this region successfully over the decadal scale without resorting to statistical scaling, in relation to measured AOD and AAOD.Furthermore, the few emissions datasets that have been made are not capable of working at a higher frequency than monthly.Additionally, they have not been directly linked to the changes in the landsurface properties that should be driving them (Cohen and Wang, 2014;Cohen, 2014).

Data and methods
Several remotely sensed and surface measurements of the surface land properties (LAI and NDVI), the number of active fires (fire count), aerosol (AOD), and precipitation (rainfall) are used in this study.These are used in conjunction with advanced analytical procedures to determine the regions which contribute the most to the variance of the impact of fires on the atmosphere loading of aerosols as observed by the AOD.This analysis, in addition to its own results, leads to the production of a simple statistical multiyear constrained model, which is shown to be capable of reproducing the AOD as a function of the land-use, fire, and precipitation measurements, even in additional years, and even as tested against measurements of AOD from different sources.All of the details of the measurements used, the procedures and methods employed, and the statistical and analytical techniques employed are detailed below.

Geography
The domain of interest for this study is Southeast Asia, which we define here as the region spreading from 90 to 130 • E in longitude and from 14 • S to 23 • N in latitude (see Fig. 1).The subregion defined as northern Southeast Asia is defined by mostly large continental land masses and a single wet season each year, and consists of Thailand, Myanmar, Cambodia, Laos, Vietnam, and parts of southern Greater China.The subregion defined as southern Southeast Asia is defined by a mixture of land and water, has two wet seasons each year, and consists of Malaysia, Indonesia, Brunei, and Singapore.Maps of the fires in January 2013 and September 2013 respectively are given in Supplement Figs.S6 and S7.

Measured data
For the basic remotely sensed measurements used in the analysis, model construction, and results, we use remotely sensed variables from the MODIS instrument on both the TERRA and AQUA satellites.Measurements of AOD (Levy et al., 2013) are from Collection 6, Level 2 product, swath by swath at 0.55 µm, and consist of both over land and over ocean, cloud-cleared pixels, measured daily with a spatial resolution of 10 km by 10 km at nadir.Each swath of only qualitycontrolled pixels of AOD data, from 1 January 2001 through 31 December 2013, has been interpolated onto a consistent and standardized 0.1 • by 0.1 • square grid.It has been shown that there is an slightly biased uncertainty in the measurement of AOD of −0.02-0.10× AOD to +0.04 + 0.1 × AOD over the ocean and ±0.05 + 0.15 × AOD over the land (Levy et al., 2013;Sayer et al., 2012).However, over this region, the magnitude of the "noisy floor" is small compared to the linear term, given that the AOD in polluted regions goes as high as 1.5 to 2.0.While this linear term seems to not be too small, it is actually quite small compared to the difference between the peaks and the troughs as obtained by the variance maximizing technique.Additionally, as shown by Cohen and Wang (2014) and others, this error is sufficiently small as to not impact the end results, especially when compared with the uncertainties in the current best-generation of models and the dynamics of the atmosphere itself.In these cases, the models tend to be lucky to obtain measurements within a 20 to 30 % range of the measurements and often perform more poorly than this (e.g., Cohen and Prinn, 2011;Cohen and Wang, 2014).AOD is a measure of the vertical sum of the extinction of sunlight (scattering plus absorption) through the atmosphere due to aerosol particles, and it therefore is a function of the atmospheric loading of aerosols, washout from precipitation, and the vertical, size, and optical properties of the aerosols.Hence, there is a physical relationship between measured changes in AOD and the emissions and subsequent in situ atmospheric processing of aerosols.It has been shown that strong spatial and temporal variability in AOD measurements over this part of the world are due to biomass burning from this region of the world, while large measurements of AOD which mostly only co-vary with precipitation (washout) are more consistent with urban emissions (Cohen, 2014;Cohen and Wang, 2014).
To estimate the land-surface and fire responses we also use the measured values of LAI, NDVI, and fire count from MODIS (Nightingale et al., 2008;Yang et al., 2006;Huete et al., 1999;Giglio et al., 2003).Measurements of LAI and fire count (Collection 5.1, Level 2 product) are made on an 8-day average basis at 1 km by 1 km horizontal resolution, while for NDVI the measurements are on a 16-day average basis at 1 km by 1 km horizontal resolution.Each product is then aggregated onto the same consistent and standardized 0.1 • by 0.1 • square grid used for the AOD.All measurements only use data which have been quality assured to be cloud free.However, in this region, there are some optically thin clouds that will not be picked up, and this may significantly bias the measurements of fire count, which are inherently based on IR measurements, but should not be as impacting on LAI and NDVI, which both depend mostly on measurements in the visible bands.
LAI is chosen since it represents the amount of leaf material in an ecosystem and hence is useful both for identifying if there were a sudden change in the amount of vegetation available and its condition (Asner et al., 2003), such as expected after leaves are consumed in a fire.It is geometrically defined as the total one-sided area of photosynthetic tissue per unit ground surface area.LAI values are 0 for bare ground, 1 to 4 for grassland and crops, 5 to 9 for plantations, and as high as 10 for dense conifer forests.One of the large drawbacks of using LAI in this region of the world is that it is hard to analyze the variance in the LAI over areas that are used for non-forest agriculture.This is because the LAI is considerably lower than the tropical primary and secondary forests.Hence, after a burning event, the absolute magnitude of the LAI and hence the amount of variance is lower.However, this is the more robust land-surface variable, given that it uses many of the wavebands from MODIS.For this reason, the variance in the LAI is most helpful in determining deforestation from fire, particularly in regions which are not found to have hotspots.
Fire count determines how many of the pixels within the area have an active fire.It is based on a two factors: first, whether there is a sufficient amount of infrared emissions to determine that there is an absolute detection of a fire of sufficient strength; second, whether the detected surface temperature is sufficiently variable as compared to the surrounding pixels.Given the complexity involved with using infrared and visible streams for the fire count, as well as the possibility of thin clouds obstructing this measurement, we only use quality-assured fire count values, those with a value corresponding to 7 or more.In this study, it is found that the number of fire count can vary from 0 to more than 5000 (with a corresponding value of 8) or more than 600 (with a corresponding value of 9) on a monthly basis.
NDVI is also chosen since it represents a measure of the health of the vegetation.NDVI is mathematically calculated from the visible (VIS) and near-infrared (NIR) light reflected by the vegetation as follows: NDVI = NIR−VIS NIR+VIS .Healthy veg-etation absorbs most of the VIS light that hits it and reflects a large portion of the NIR light.In contrast, unhealthy or sparsely healthy vegetation, such as after being burned, reflects more VIS light and less NIR light.Given this formula, a value close to zero (−0.1 to 0.1) implies that there the land is barren with respect to living and green vegetation, whereas values close to +1.0 correspond to the highest density of healthy green leaves.NDVI is an ideal way to search for the ratio of the magnitude of the variance to the absolute mean.This is because the variance in the change in the health is actually proportionate to the initial value.In this case, while the overall variance is not too much throughout the region, the ratio is considerably high in regions that undergo rapid change such as from burning.However, such changes are not very useful for looking at small changes over large periods of time; more are useful at looking at changes occurring over short periods of time.This is one way to overcome the issue of regeneration, due to either natural regrowth or anthropogenic planting.Furthermore, since the onset of the monsoon brings sufficiently large amounts of precipitation that it usually leads to the end of the fire season (Cohen, 2014;Natalia Hasler and Avissar, 2009), knowledge of the rainfall rate is important.For this, we use TRMM measurements of precipitation, as generated by the 3B42 algorithm.This produces daily average precipitation measurements at 0.25 • by 0.25 • spatial resolution over the areas of interest for this work.
To validate the results, we also use two additional measurement platforms for AOD from AERONET and MISR.From AERONET (Holben et al., 1998) we either use available AOD at 0.55 µm or interpolate the surrounding wavelength-specific measurements to 0.55 µm, at nine different stations located in the region of interest.We use all individual Level 2.0 data points, cloud screened and validated, and then averaged to form a daily value, where a sufficient amount of data are available.At the four stations where there are insufficient data, we use individual Level 1.5 data points.However, before forming the daily average value in the case of Level 1.5 data, we only retain the AOD measurements when the corresponding Ångström exponent is larger than 0.2, giving us reassurance that the product is relatively cloud free.This has been tested by varying the sensitivity from 0.1 to 0.4 (the minimum physically acceptable value must be positive) and there is little change in the end result.Although AERONET is the most precise measurement platform for AOD, it is limited in spatial coverage.Therefore we also use measurements from MISR (Diner et al., 1998;Kahn et al., 2010) of AOD at 0.55 µm, with a monthly temporal resolution and a 0.5 • by 0.5 • spatial resolution.The reason for choosing MISR is that it has a smaller error with respect to AERONET over this region of the world than any other satellite platform (Petrenko and Ichoku, 2013), which allows us to provide spatially distributed validation.Although MISR has a narrower swath width than MODIS in this region of the world, there are actually more data points that are retrieved at the AERONET stations and that the error is lower in comparison to the AERONET Measurements.This is partially due to the fact that it is able to cloud process and clear more efficiently than MODIS due to the spherical fraction.Additionally, the fact that MISR is able to measure AOD levels greater than 2.0 allows it to actually obtain more pixels on a monthly basis over this region than MODIS.However, the major downside is that only at a monthly average or lower is frequency available, with the monthly dataset having from four to eight data points per measurement.It is therefore effective over this region at obtaining a spatial distribution upon which to extend the more precise AERONET results.However, this helps quite a bit with the cloud-clearing statistics.Combining these together allows the use of the higherquality AERONET data as an anchor, where available, to evaluate any errors in the magnitude between the model and the measurements, even away from the source, so long as it is still in the same geographical region (as described below).It also provides a means for investigating how error propagation between various different measurement sources can be quantified.
All of the data used have been taken from January 2001 through December 2013.In the case of remotely sensed data, it was first interpolated (in the case of AOD) or aggregated (in the case of fire count, NDVI, and enhanced vegetation index, or EVI) onto a 0.1 • by 0.1 • square grid, using only qualityassured data.These gridded data sets were then aggregated or interpolated respectively to the temporal resolution used, either 1-day, 8-day, or monthly average temporal resolution, to make them consistent.AERONET measurements have also been taken using whatever data was available over the same respective 1-day, 8-day, and monthly periods, and have been considered to be representative of the entire corresponding 0.1 • by 0.1 • box in which they are located.One of the significant advances of using this approach is the ability to analyze how the results are improved by using data with different temporal variability.

Variance maximizing technique
Aerosol emissions and resulting changes in AOD in the Southeast Asia region mainly comes from two types of sources: urban/anthropogenic and fires.Emissions of aerosols from urban/anthropogenic include those from cities, transportation, and industrial processes, which generally include temporally and geographically regular combustion of coal, oil, and natural gas throughout the year.In contrast, emissions of aerosols from fires, which include clearing of forests, agriculture, peat, and rubbish, are more highly irregular over space and time, preferentially occurring under certain economic conditions as well as during periods of dryness, either due to changes in irrigation or under the influence of various meteorological/climatological conditions (Cohen, 2014).As the ultimate goal of this study is to develop an understanding and constraint on the absolute source of aerosol emissions, and since fire is the most uncertain contribution in this region, the analytical technique must target the large amount of variance in the measured fields of the AOD.Those regions which both contribute the most to the variance of the AOD field as well as correspond to a large annual amount of AOD on an absolute basis are the regions which are most likely fires.A simple check of the geography has been performed to eliminate any false positives that are known to be urban or industrial regions, of which there are at least three in the regions under study: in Vietnam, Indonesia, and Malaysia.However, it is possible that rapidly developing industrial uses of the land, such as new large mill towns in Indonesia (as witnessed by the author), were not fully identified.Further, observed land-use changes were considered to be reasonable if they corresponded to reasonable changes in the values of NDVI and LAI.
To achieve these goals, we first employ the empirical orthogonal functions/principal component analysis (EOF/PCA) technique on the 8-day average AOD product.This is one of the beautiful things about using the EOF approach: patterns in the variance of the data search for the set of the relative maxima.Therefore, since the process searches for the highest and lowest values and gradients in space and time, any unbiased error in the measurements, will not significantly impact the result.Furthermore, the 8-day average product was chosen, so that it could take full advantage of the higher frequency of the MODIS data, when compared with the MISR data.Additionally, the lifetime of the aerosol plume is roughly on order of this period of time, given the low amount of precipitation and the high amount of aerosols lofted into the atmosphere due to the heat from the fires, making source-sink and overall statistical properties robust (Cohen, 2014;Lin et al., 2009Lin et al., , 2014)).
The specific EOF/PCA analysis decomposes the 8-day AOD data F into subcomponents.Each subcomponent is orthogonal to the whole, and can be ordered based on the overall contribution to the fractional amount of the overall variability (Bjornsson and Venegas, 1997).This is done by decomposing the measurements into independent (orthogonal) spatial/geographic modes S i and their associated temporal/time modes T i , as explained in Eqs. ( 1)-( 5), where a ij are the individual measurements (i is the marker indicating the ordered number in latitude/longitude, and j is the individual marker indicating the marker in time), and c i and y i , which are the corresponding decomposed values of the spatial and temporal maps accordingly.

Regression-fit model connecting land-use change to AOD
Along with the analysis, we also employ a simple multivariable linear regression model to predict AOD from measured land-use and meteorological variables.This approach is adapted because of the physical nature of the relationship between these variables.Fires lead to a direct drop in LAI in currently growing vegetation through the combustion process.In the case of agriculture which has already been harvested, the LAI would have previously dropped, while the dried products are left to burn.Similarly, if there is a change in the vegetation/agricultural state after the fire, this should show up by a restored LAI, although at a different magnitude.NDVI would similarly be impacted, as the chlorophyll is combusted along with the plant material that is associated with it.Furthermore, the hypothesized loss of efficiency of the land surface associated with the fires would show up as a lower-frequency change in the NDVI.Due to these reasons, there may be a lag expected between the occurrence of the fire and the change in the land-use variable.However, given the rapid rate of regrowth over this region, and the high degree of cloud cover, it is found that day-to-day passes or information are not very reliable.It is uncertain how much lag would be expected in weekly averaged or 2-weekly averaged products.However, our results have determined that in fact the relationship which is based on no lag produces the best fit.Hence, one of the objectives is to quantify the impacts of different averaging kernels applied to the measurements.
There is also evidence that in regions where dryness is an issue, which it certainly is during the extremes of the dry season throughout Southeast Asia, that NDVI recovers slower than LAI.This would certainly be the case in regions in which peat is being drained or has recently been drained, such as the southern Southeast Asia region, or in regions where there is little to no irrigation, such as the northern Southeast Asian region (Hope et al., 2007;Morawitz et al., 2006;Cuevas et al., 2008).The clear indication here is that the rate of greenness regrowth, as observed by the change in the NDVI, may not relate to the canopy and soil moisture regrowth, which is more related to the LAI due to the additional bands in the NIR.However, in regions in the topics that are either managed or fed by the arrival of the monsoon, this is not expected to be a significant issue, and hence a possible reason why little lag is actually observed.
During the dry season, how dry it is will impact the amount, intensity, and duration of the fires as a whole.In practice, years with wetter dry season or a drier dry season should have a reduction in the intensity of the fires as well as their geographic spread, although it will not necessarily lead to them being altogether suppressed.This relationship is slightly more complex, since there are cases where anthropogenic water due to irrigation, burning occurring on very wet peat, or fast-moving thunderstorms can make the ground quite wet but still continue to burn, thereby leading to an increase in emitted aerosol, and hence AOD, due to a switch of the type of fire from flaming to smoldering (Field et al., 2009;Saatchi et al., 2013).However, these cases are over and beyond the approach taken here and are still not fully understood.It is thought that surface wetness is critical for this switch, although in theory this is partially a function of the LAI, NDVI, and precipitation and hence could be approximated to first order using the approach employed here, with the physical variable itself at least being partially captured (Fisher et al., 2009;Phillips et al., 2010;Wohl et al., 2012).Another advantage is that low-temperature fires, which may otherwise go undetected, can still be represented, since they still impact changes in terms of AOD, LAI, and NDVI.
To ensure that the impact of fires is physically as expected on AOD, in which an increase in fire should lead to an increase in emissions and hence AOD, we employ two different regression equations.Both equations use LAI, NDVI, and precipitation as predictive variables R 2 (Eq.7), while only one R 1 (Eq.6) also uses fire count.The regression coefficients α i , β i , γ i , and δ i are computed by minimizing the root-mean-square errors (RMSEs) of Eqs. ( 6) and ( 7).Using these constrained values, the AOD can be approximated during different seasons or over different areas, such as those which are cloud covered and hence do not show measurements.These reconstructed values are generated and specifically compared against AOD values from other measurement platforms, specifically MISR and AERONET.
Since the nature of the land-use change, the amount of precipitation, the state of native vegetation, and the strengths and timing of the AOD signal are different over the two regions S 1 and S 2 , we compute the fitting for AOD over reach region separately.This helps us better quantify and understand the functional relationships between these variables under the different land-use types, land-use management practices, and climatology, when the fires actually do occur.This is especially important in southern Southeast Asia, where there is stronger year-to-year variability, the issue of cloud cover is much more pronounced close to the equator, there are only very few ground station measurement sites, vastly different sets of anthropogenic land-use policies in different regions, and different magnitudes of fire emissions.
To test separately only the fire-occurring seasons, we define fire activity periods over each region as the days during which T 1 and T 2 , respectively, are above the threshold τ north and τ south .Different thresholds for T 1 and T 2 are tested, based on the percentile P of the time series beneath the point τ if the time series were to be regrouped and sorted.We thus use the points P = {0.90,0.835, 0.75}.Since this method is testing for the extreme values in the AOD variance, or when the fires are occurring, this method proves to be methodologically suitable, as it is further providing a constraint on the more extreme conditions and when the pattern is most significant.The values chosen are not arbitrary, as they are based on the statistical robustness of the magnitude of the fields S i × T i .However, the point of the sensitivity analysis is to quantify at what point the errors in the analytical technique are no longer able to statistically retrieve the maximum contributions to the variance of signal, as compared to just picking up the variance induced by the unbiased errors in the measurements themselves.

Decadal-scale analysis of remotely sensed measurements of AOD and land-surface properties
The subsequent analysis performed using the variance maximizing analytical technique only retains those modes S i , T i that explain at least 5 % of the total variability.This is to ensure that any signal found is larger than the uncertainty in the measurements themselves and hence should be physically relevant.Specifically, since the MODIS AOD uncertainty is 5 %, we need at least 5 % of the variance for a mode to represent something useful (Bjornsson and Venegas, 1997).Using this constraint, there are two modes, i = {1, 2}, that explain the variability in the 8-day AOD measurements (see Fig. 2).
Of the variability in the AOD field, 38 % maps to region i = 1 as shown in Fig. 2a, which we will hence refer to as northern Southeast Asia; 13 % maps to region i = 2 as shown in Fig. 2b, which we will hence refer to as southern Southeast Asia.The next largest mode contributes less than 5 % to the total variance in the AOD field and therefore is indistinguishable from other sources of variability and error, such as nonlinear effects of El Niño, planetary dynamical events such as the MJO, regional dynamical events, small-scale perturbations, short-term anthropogenic events, unaccounted for variations in cloud cover, bias in the data, and new urbanization around the edges of the growing megacities.
The physical relevance of these mathematical modes is established by correlating the computed measured average AOD over the respective regions S i , as a time series, with the respective principal component T i .The modes are found to be highly correlated with both the AOD over northern Southeast Asia (R 2 = 0.86, p < 0.01) and the AOD over southern Southeast Asia (R 2 = 0.86, p < 0.01), as shown in Fig. 2c and  d.
Over northern Southeast Asia there is a partially bi-annual peak, with some years having a single peak and others have two peaks.The major peak, which is the more pronounced or sole peak, occurs every year in the measured AOD averaged over T 1 during the latter part of the local dry season (from mid-February to late April).Looking at the average value of the time series of the AOD measurements over S 1 , it is found that the AOD peaks at the same time as T 1 peaks, and that the average AOD ranges from 0.46 to 0.86, depending on the year.The smaller peak occurs in August and September as shown in T 1 in most of the years (but not in 2008, 2010, or 2011).Similarly, the average of the measured AOD over the region S 1 during the same months and years has a corresponding peak ranging from 0.40 to 0.63 during the years when the second peak occurs.The only disagreement between T 1 and the measured time series of averaged AOD over S 1 occurs during 2003, which has already been noted previously by Cohen (2014), although none of the variables used in this study can explain why.
Over southern Southeast Asia, there is a one-to-one agreement between the peaks in T 2 and the peaks in the averaged measurements of AOD over S 2 , with the peaks occurring in which occur on different scales and have various different size holdings in each case, meaning that small differences in timing, intensity, and duration are to be expected from when the people decide to burn and how long they decide to burn for (Taylor, 2010).There is agricultural and straw burning in Thailand, subsistence burning in Cambodia, forest clearing in Myanmar and Laos, and urban and agricultural expansion in Vietnam, and some of these agricultural regions, especially related to rice, have two crops a year and thus the possibility of being burned more than once (Dennis, 2005;Tipayarom and Oanh, 2007).Thirdly, the dry season here tends to be extremely dry, without even occasional rainstorms.There-fore, any emitted particles tend to have a very long lifetime.Hence, the impact of secondary chemistry is important.This chemistry tends to be very sensitive to the emissions ratios, to clouds, and to any nonlinearly emitted secondary species from urban areas as the plums proceeds downwind.In contrast, in southern Southeast Asia the population is also large, but in many of the places in Indonesia and Malaysia that are source regions the cities are large and well contained, while the countryside is still relatively empty.Secondly, in this region, the major cause of burning is the clearing of primary forests, and much of this is done by a smaller number of large land holders, further reducing the variability.This is especially so on a year-to-year basis; during some years there is relatively little burning at all.Finally, even during the dry season, there is still a considerable amount of small-scale convective precipitation and day-night sea-land breezes and rain.Hence, the lifetime of the particles and secondary precursors tends to be slightly shorter, and the impacts of nonlinear secondary processing are also reduced.Hence, the fact that southern Southeast Asia often has an even higher average AOD means that the emissions must be considerably larger in terms of magnitude from year to year, although not necessarily more variable within each year, as also found in Cohen (2014).
These results are clearly consistent with the time-averaged values of the land-use measurements of LAI and NDVI when averaged over regions S 1 and S 2 respectively (Fig. 2).Over S 1 , we can clearly see that much of the region either has an average LAI which is far too low to correspond to native of secondary forest, implying that the land is now agriculture or there is still a high average LAI value with a corresponding reduction in NDVI, implying that primary forest is being deforested in exchange for some type of commercial agricultural tree crop, such as palm oil, rubber, or wood for paper.However, the region over which this second category is occurring is smaller in size than the first region with the simultaneous decrease in both LAI and NDVI (Huete et al., 2002;Myneni et al., 2002Myneni et al., , 2007)).In contrast, over the region S 2 we find that the LAI is still generally quite high throughout the region of interest, while the average NDVI is falling at an even faster rate than the drop over the smaller region in S 1 in which a similar type of condition is occurring.This is completely consistent with the known large-scale deforestation occurring throughout Indonesia and Malaysia where mostly primary forest is burned and replaced with large-scale agricultural tree-based crops (Dennis, 2005;Phillips et al., 2010;Taylor, 2010;Wooster et al., 2012;Field et al., 2009).
A spatial mapping of the climatological mean and standard deviations of LAI and NDVI over Southeast Asia are displayed in Fig. 3. First, it is observed that the LAI is smaller in average over northern Southeast Asia (LAI = 2.3) then over southern Southeast Asia (LAI = 3.5).Similarly for NDVI, the average value over northern Southeast Asia is (NDVI = 0.61) while over southern Southeast Asia it is (NDVI = 0.70).This is consistent with the knowledge that in northern Southeast Asia the land has been more altered from its base tropical rainforest state (Natalia Hasler and Avissar, 2009;Taylor, 2010).In fact, there is a considerable amount of rice and other agriculture which has completely replaced trees with crops.Also, the pace of forest clearing is quite rapid in those regions which still retain a considerable amount of native forest.The only considerably widespread regions of native forests are left only in Laos and at the frontier regions near the intersection of Laos, Thailand, and Myanmar.

Influence of measured fires
To look at the impacts of measured fires, we fit the relationships between LAI, NDVI, precipitation and AOD in two cases, both with and without the inclusion of the fire count variable using REG 1 and REG 2 .This is done separately over both the northern and southern regions with the corresponding different thresholds.A comparison of the time series of the region-averaged AOD from each EOF region, the four model-predicted AOD values, and the measured averaged AOD is made.The average statistical error and average statistical correlation (coefficient of determination, R 2 ) between the datasets and the regression-fit model-predicted AOD used to determine which threshold τ is ultimately used for the purpose of determining the best-fit coefficients for α i , β i , γ i , and δ i .The resulting statistics are displayed in Table 1.
As expected, including the fire count variable significantly increases the performance of the algorithm in terms of correlations: on average the correlation increases from 70 to 79 % in the northern region and from 66 to 75 % in the southern region.However, there is no improvement in the mean error between the reconstructed data and the original measured AOD.This means that if there is a hotspot measurement available, it will improve the ability to predict the spatial and temporal distribution of the fires, but provides no help in terms of estimating the AOD or emissions.This is physically consistent, since the actual emissions should be a more complex function of the type of burning, the material burned, and the conditions under it was burned, not just the existence of a fire.Additionally, this is consistent because the fire count product only quantifies the likelihood of a fire occurring within the given pixel but provides no information on the intensity of the fire.Furthermore, the results of the fitting of the regression coefficient associated with fire count (Fig. 4) show that the coefficient is strongly positive over the regions where fire is the most important and AOD variability the strongest (regions within the dots).Thus, the results are found to be consistent with what is understood -that fire count is a reasonable predictor of emissions of aerosols from fires, but this factor is only useful as a predictor of the effect, not as a means of understanding the magnitude of the effect.
The best-fit regression coefficients associated with NDVI make more physical sense in the case where the fire count predictor is used REG 1 (Fig. S2a) than in the case where   it is not REG 2 (Fig. S2b).In all cases, the timing is based on the 8-day period in the year, days 1 to 8 being the first data point, days 9 to 16 being the second data point, etc.In this way, multiyear variances in the climatology can be rigorously analyzed.In general, a negative coefficient is found, which implies that regions will lose NDVI as a result of an increase in AOD, which is consistent with the health of the land decreasing during a fire.A similar gain is also found in terms of the best-fit coefficients for LAI in the regions which are not rice dominant (rice has a significantly low LAI so that the signal-to-noise ratio from the satellite product is too low to produce a statistically significant result over these regions).The regression coefficients are thus consistent and for this reason we only refer to REG1 from this point forward.
Making comparisons between the regression constructed AOD and the measured AODMODIS over northern Southeast Asia leads to the determination that in average, using τ north = P75(PC1) as the threshold of fire activity leads to the best results, as shown in Table 1.This leads to the reasonable conclusion that in order to represent the AOD during the fire season well, there must be greater access to data, while to represent the AOD during the non-burning or low-burning seasons, less data are required.This is consistent with the variability being considerably larger during the burning season in both space and time over the region of interest.
However, for southern Southeast Asia, using a very small value of τ south = P12.5(PC2)gives the best statistics.This means that using less data improves the fit during the fire season as compared to the use of more data which better constrains the fit over the whole year.This is not intuitive and is only consistent with the case that either (a) the data are more likely to be of low quality during the burning season (i.e., the data are corrupted by clouds) or (b) there is a considerable amount of data missing during the burning season (which is also possible due to the widespread distribution of clouds over much of both Borneo and Sumatra).This view is also consistent with the year-to-year and decadal scale of variability, wherein some years will have little to no fire, and hence data are required over a considerably longer period of time, including both high-and low-fire years in order to properly reproduce the observed patterns.For the remaining of this analysis, we only consider (1) that the reconstructed data set of AOD over the northern region has been computed by using τ north = P75(PC1) as fire threshold and (2) that the reconstructed data set of AOD over the southern region has been computed by using τ south = P12.5(PC2)as fire threshold.These two data sets will be referred as AOD north, REC and AOD south, REC .

Comparing AERONET measurements over northern Southeast Asia
Seven stations from AERONET are situated within the northern region (Chiang Mai, Pimai, Bac Giang, Nghia Do, Vientiane, Mukdahan, and Ubon Ratchathani) and four stations are located inside the southern region (Jambi, Kuching, Palangkaraya, and Singapore).The location of those stations is displayed in Fig. 3 and Table S1 in the Supplement.Of these stations, three are urban sites located downwind from burning regions: Singapore, Bac Giang, and Nghia Do.The remaining sites are located directly in or adjacent to burning areas.Figure 5 displays the temporal series of the AERONET AOD (black curve) and regression-fit modeled AOD (blue curve) at the seven stations situated within the northern region.Table 2 displays the statistics of the goodness of fit between the measured AOD and the reconstructed AOD respectively, AOD MODIS and AOD north , in terms of reproducing the AERONET measured AOD signal.
The first general observation is that all AERONET stations in northern Southeast Asia have an annual peak in their AOD which occurs during the fire season, from February through April each year.Additionally, each station has a smaller second peak over many of the years, but not annually, occurring in August or September.At the two remote stations: Pimai (Fig. 5c) and Ubon Ratchathani (Fig. 5d), AOD reaches its maximum value of over 0.5 during the fire season, while generally the values are clean throughout the rest of the year except for 2001 to 2006 and 2009, with a second local maximum of around 0.46 in September and October.At Pimai, the AERONET data show high pollution during the fire season every year from 2003 to 2008.The model captures all these events correctly in terms of duration, with the onset and end times slightly off, leading to a correlation of 43 %, with an intensity mean error of −0.12.At Ubon Ratchathani, the AERONET data show high pollution events during the fire season of the years 2010 to 2012.The model captures all these events in terms of duration (correlation of 80 %) but underestimates the intensity by a slightly larger mean error of −0.22.A large peak of high AOD can been seen in September 2012 corresponding to a high-pollution event also observed in Singapore.This peak, which has a maximum AOD value of 0.6, is captured by the model.During the common years of data between AERONET and AOD north (2008 to 2013), we calculate that the model captures the fire season and the pollution that is generated by it in terms of duration (correlation is 64 %) and less well in terms of intensity (mean error of −0.26).Although the error in the intensity is not insignificant, it is still significantly better than most other errors from model studies over heavily biomass burning influenced areas of the world; the mean error is still quite good, since most admit to requiring a scaling factor from 1.7 to as much as 5 (e.g., Wu et al., 2011;Cohen and Wang, 2014;Hodnebrog et al., 2014).There are three stations which are situated at mediumsized urban sites which are also adjacent to or directly upwind from fire burning regions: Chiang Mai, (Fig. 5e), Mukdahan (Fig. 5f), and Vientiane (Fig. 5g).All show a strong annual peak during the fire season from February to April.At Chiang Mai and Mukdahan, which are nearer to the agricultural fires, the maximum value of AOD is around 0.5, while it is around 0.6 at Vientiane, which is located further downwind and hence able to undergo additional secondary processing.Figure 5e, f, and g also show smaller peaks during other parts of the year: from September to October for the years 2001 to 2006 at Chiang Mai, with a maximum AOD value of 0.4; from July/August and to October/November (depending on the years) for the years 2001 to 2007, 2009, and 2010 at Mukdahan, with a maximum AOD value of 0.44; and from September to October for the years 2001 to 2007, and 2009 at Vientiane, with a maximum AOD value of 0.59.The nature of these secondary peaks are not annual in occurrence.At Mukdahan, the AERONET data demonstrate the fire season peak for every year the data exist : 2004, 2006, 2007, 2008, and 2009.The regression-fit model reproduces the high pollution every year (R 2 = 0.69), while also reproducing the intensity correctly in 2007 and 2009.While there is only very sparse AERONET data at Vientiane, the regression-fit model reproduces the signal well (R 2 = 0.64 and RMS = −0.07)(see Table 4).Finally, the model also captures the high pollution events measured in March, April, and September 2012.
As expected, there is a considerable amount of variability at stations which are in or near large urban areas (megacities) due to the combination of both the fire signal as well as local emissions and in situ secondary processing.In particular, the signals at the two stations near to the rapidly growing urban megacity of Hanoi: Bac Giang (Fig. 5a) and Nghia Do (Fig. 5b) are very similar.These stations have a much higher annual average AOD than the other stations in the region, with the daily average value as well as long-term mean measured AOD being frequently larger than 0.4, while the annual high AOD peak has a yearly maximum of at least 0.9 at both of these stations (see Table 3).Figure 5a and b also show smaller AOD peaks (maximum value of around 0.7) during other parts of the year (from July through November depending on the year).During the fire seasons in 2004 and 2007 at Bac Giang, the timing of the high pollution events is well captured by the regression-fit model, in terms of onset, duration, and end time, although the model intensity is underestimated.
In 2006, the southern Southeast Asian fire season produced an extensive and massive amount of emissions T 2 due to extremely dry and warm conditions brought on by the El Niño conditions.Various models and measurements have shown that the fires from these emissions have spread from S 2 throughout the Indian and Pacific oceans (Podgorny et al., 2003).However, we have also found that the signal is clearly present at all of the stations located in S 1 .At Chiang Mai, Mukdahan, and Pimai the intensity of the 2006 season as well as its onset, duration, and conclusion are all well reproduced in both the AERONET measurements and the regression-fit model.Even at the urban megacities Bac Giang and Nghia Do the AERONET measurements also display a high pol- lution peak (AOD = 1.2) around September 2006, while the regression models at both of these stations capture the measured onset, duration, and ending of this event.The only issue is that the magnitude of the regression-fit model AOD underestimates the measured value by as much as 33 % at Bac Giang.
Given the intimate connection between fires and the ensuing rapid changes of the land surface which occur at the same time, as expected, LAI and NDVI have changed at the same locations as the AERONET stations.First, they show a correspondingly higher value during the second, localized peak, than at the major annual peak, with a maximum value of around 0.9 at these stations (see Table 3).Figure 5a and b also show smaller AOD peaks (maximum value of around 0.7) during other parts of the year (July through November depending on the year); see Tables S1 and 3.This indicates that the second peak, which does not occur year to year, may not be attributed to large-scale local burning unless the local fires are much less extensive and thus do not lead to significant change in the land surface, but happen to just be upwind of these measurement stations in these given years, or that the local fires are much more polluting per unit of land-use change, and hence still contribute to the AOD to some extent.The other possible explanations are that the pollution during these times is actually transported from other place or is intensified due to some sort of secondary processing.However, it is also found that these changes in the year-to-year LAI and NDVI values do not vary in a one-to-one manner with T 2 , which has some covariance during the big fire years of 2002, 2004, 2006, and 2009 but not during other years in which the peak occurs, such as 2001, 2003, 2005, and 2007.
Overall, we find that the annual peak in AOD throughout S 1 is clearly due to fires and that this is true for both urban, partially urban, and remote sites.Further, during these fire events, the dominant source contributing to the peak in AOD is from the burning itself, even in urban areas where it may be one of two dominant sources.Additionally, there is a second peak found at these stations, which is smaller in magnitude and only occurs in certain years.This secondary peak is very likely not due to local burning, and instead it is shown that a significant number of these years co-vary with analyzed large-scale fires from region S 2 , indicative of longrange transport.However, since there are a few years during which this is also not the case, it is possible that other sources of long-range transport or secondary production of aerosols, such as from South Asia.

Comparing AERONET measurements over southern Southeast Asia
In southern Southeast Asia, S 2 , the majority of the emissions come from a small number of well-defined major urban centers, transport lines through the waterways, and widespread sources from fires, with much of the region still continuing primary forest or dense secondary forest.As a consequence, the major source of the variation in the AOD is a combination of the emissions from fires and precipitation (as it is the major source of the aerosols removal from the atmosphere).This is demonstrated in Fig. 6, demonstrating a smoother and less variable set of measurements during the wet season than at sites 5 and 6 over northern Southeast Asia.Consequently, the AERONET site in Singapore, the sole large urban area in S 2 , is very different from the other stations of this subregion.Unlike in northern Southeast Asia, in general, the AOD signal in southern Southeast Asia tends to only peak once a year (except for in 2009 and 2014, which are special cases to be discussed later that had two peaks due to primary fire emissions).This primary peak, as shown in T 2 , always occurs during the local fire season from August through October/November, without any additional second peak occurring during a non-burning period, as in T 1 .Effectively, this implies that emissions from S 1 are not contributing to the variance in the measured AOD over S 2 and that long-range transport from northern Southeast Asia is not efficient in contributing to the high peaks in AOD found over S 2 .
Additionally, southern Southeast Asia has an important source of uncertainty and bias in the measurements over the region.Specifically, the impact of intense cloud cover is also determined to be very important, in terms of being able to capture all of the known large-scale fire-based events.We observe that in a few special cases, where known large-scale pollution events have occurred over S 2 as measured both on the ground and by MISR measurements of AOD (Cohen, 2014), MODIS was not able to successfully capture the events (for example, June 2013).A careful examination of the cloud cover fields and fire count measurements shows that this is clearly the case, at least for June 2013; the region S 2 was almost completely masked by clouds (over 80 % of all pixels) in the day-to-day tracks, with more than 90 % of pixels in the 8-day average fields over this period of time being masked.The AERONET station in Singapore is located in a highly urban environment, with sizable sources of aerosol emissions related to shipping, a high energy using population, and refineries.It is clear that there are no wildfires occurring within Singapore.At the Singapore station, we observe an annual signal except for every year, although during the years 2008 and 2010 the signal is less intense than in the other measured years.There is a considerable amount of variation in the magnitude, the onset, and the duration of the peak, as well as a considerable amount of noise.However, the maximum measured AOD here on an 8-day average basis ranges from a low year of 0.55 to a high year of 0.81 in 2006.Even though the fires were quite distant, it is clearly observed that then most intense event in 2006 is readily captured here, further supporting that, even in an urban environment, Singapore offers a reasonable downwind signal site for observing the impacts of the fires.
In contrast, the other AERONET stations in this region, including in Kuching, Jambi, and Palangkaraya, are situated in remote and mostly heavily jungle/forested regions of Borneo and Sumatra islands (see Table S1).These sites are all located close to where the fire sources originate, in the jungles and forests of Borneo and Sumatra.The AERONET station in Jambi, situated on Sumatra Island, has an annual signal of high AOD occurring once a year, every year, except in 2010 (where there were no measurements).However, the magnitude, onset, and duration of these high pollution events is highly variable from year to year.The AOD maximum value ranges from a low of 0.67 (in 2007) to a high of 1.49 (in 2006) (see Table 5).The AERONET station in Kuching has an annual peak signal in AOD every year that measurements are available (there were no measurements during the corresponding peak times in 2008, 2010, and 2013).The magnitude, start, and duration of this peak are again highly variable from year to year, with the maximum in measured AOD ranging from a low of 0.68 in 2007 to a maximum of 1.36 in 2006.At Palangkaraya, which is situated in Western Borneo in Indonesia, there is also a single high peak occurring every year, except for 2010 (which did not have any measurements).Similar to the other stations, the intensity, onset, and duration of the high AOD signal were very variable from year to year.
The regressive-fit model based on the MODIS measurements at each of the remote sites in southern Southeast Asia, Jambi, Kuching, and Palangkaraya, is capable of reproducing the major heavily polluted years as found in the measurements, such 2002 (maximum AOD of 1.24 in Jambi, 1.0 in Kuching, and 1.94 in Palangkaraya), 2004 (maximum AOD of 0.99 in Jambi, 0.85 in Kuching, and 1.18 in Palangkaraya), 2006 (maximum AOD of 1.49 in Jambi, 1.4 in Kuching, and 1.98 in Palangkaraya), and 2009 (maximum AOD of 0.95 in Jambi, 0.87 in Kuching, and 1.02 in Palangkaraya).At Jambi and Palangkaraya, the regressive-fit model reproduces the high AOD event of late 2012 well, with a better correlation with the AERONET measurements (R 2 = 76 % at Jambi and R 2 = 74 % at Palangkaraya) than MODIS AOD at the same grid point (R 2 = 51 % at Jambi and R 2 = 71 % at Palangkaraya), as given in Table 6, although the intensity in these years is slightly low.However, the regressive-fit model reproduces the AOD well in terms of intensity, onset, and duration at Kuching (RMSE of 0.13, R 2 = 66 %) (see Table 6).However, the regressive-fit model is still basically constrained by the cloud cover issue.It is for this reason that the known high values of aerosols in the atmosphere over Singapore in June of 2013 (as based on surface measurements and personal observation) are not captured in AERONET measurements, MODIS measurements, or the regressive-fit model.In addition to June 2013, we also find that MODIS AOD and the regressive-fit model are both not capable of capturing the 2010 fire season peak either.However, the issues of cloud cover seem to be less important in other years, and we find the onset, duration, and intensity are all well matched between the regressive-fit model and AERONET measurements at Singapore during the fire seasons of the years 2007, 2008, 2009, 2011, and 2012 (see Table 6 for statistics).

Comparisons versus measurements from the MISR satellite
MISR satellite measurements of AOD are at lower spatial and temporal resolution than MODIS and AERONET measurements, and thus, to use them as a basis for comparison, the values from MODIS and AERONET will be averaged to a monthly basis as well as at 0.5 • × 0.5 • .Over northern Southeast Asia, the time series of the regression-fit model AOD compares very well with the time series of the average MISR AOD over the same region (R 2 = 0.77 over S 1 , and R 2 = 0.85 over the region of highest variability).While there is some underestimation of the absolute AOD as compared to the MISR measurements, that underestimation is always less than 0.1 and therefore is not far from the order of magnitude of the error in the measurements themselves.One of the important reasons why the agreement is so good is that this region is generally cloud free during the dry season when the fires occur, and hence there is a quite large and representatively similar sampling size between MODIS, MISR, and AERONET during the fire periods in this region.This establishes that indeed the MODIS-based regression-fit model matches well against MISR and is able to reproduce the variability and magnitude of the AOD over northern Southeast Asia (Fig. 7).
Not surprisingly, when fitting the results of the MODIS regression-fit model using 8-day average data, the overall fits are not as good when comparing against MISR.Part of the issue is the additional variability, but more importantly is the lack of sufficient data due to cloud coverage.Specifically, over the region S 1 , the correlation rises from R 2 = 0.66 to R 2 = 0.81 when increasing from 8-day to monthly averaging.Similarly, the comparison between the AERONET data  (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013).Regions of highest AOD variability from the EOF analysis are delineated by black dots.Within the northern region the mean correlation is 84.8 % and the average is 77.3 % (mean error is 0.06), while in the southern region the mean correlation is 79.2 % and the average is 72.4 % (mean error is 0.08).The mean errors are given on the left-hand side, while the correlations are given on the right-hand side.
and MISR AOD also increases from R 2 = 0.59 to R 2 = 0.79 when comparing 8-day averages and monthly averages respectively.Overall, the regression-fit model is able to reproduce the variation of AOD at all the stations in northern Southeast Asia, in terms of both duration and intensity concerning high pollution events (see Figs. S5 and S6).
As expected, the spatial comparison between MISR and the regression-fit model over southern Southeast Asia is not as good.The first thing to note is that the spatial extent of the region from MODIS, given with the relatively level of high certainty by S 2 , is considerably smaller than a similar spatial distribution of the smoke extent over this same region, when analyzed in the same way using data from MISR measurements (Cohen, 2014).This is explained in part due to the larger cloud-covered fraction in the MODIS measurements when compared with MISR, as well as the shorter averaging period with the MODIS measurements, leading to a situation where there is insufficient information at each averaging time step over much of the region.It is found that the RMSE between MISR and the regression-fit model ranges from a minor and relatively insignificant (as compared to the measurement errors) model overestimation of 0.1 in AOD to a substantial and significant model underestimation in the AOD of up to 0.5.This regression-fit model underestimation as compared to MISR measurements is significantly larger than the AERONET and MISR disagreement over this region, which is less than 0.3 (Cohen, 2014;Shi et al., 2011), and, further, this error occurs especially and exclusively during the intense fire-burning years.However, the overall temporal correlation between the regression-fit model and the time-averaged AOD from MISR is R 2 = 0.72 overall and is as high as R 2 = 0.79 over the region of highest AOD variability.This means that the inter-annual and intra-annual variation is relatively captured by the MODIS measurements and the resulting regression-fit model.

Conclusions
An in-depth analysis of multiple measurements from MODIS, MISR, TRMM, and AERONET measurements has been performed over a 13-year period over Southeast Asia.Using MODIS AOD, the spatial and temporal patterns of the contribution of fires to the atmospheric loading of aerosols was established.Two distinct regions, with vastly different properties were observed: one in northern Southeast Asia, which had a strong annual signal with some inter-annual vari-ability, and another in southern Southeast Asia, which had a strong signal with inter-annual and intra-annual variability.Northern Southeast Asia shows an annual high AOD during the fire season (varying roughly from February through April), with a smaller nearly annual peak occurring during the exact timing when southern Southeast Asia has its fire season.Southern Southeast Asia is affected every year by their own fires (from roughly August through October), without any observed secondary peak except for during two exceptionally dry years during the second very short dry season in February 2009 and the very end of 2013 (which would maximize in February 2014, although it is beyond the end of the data analyzed in this paper).The representation in terms of the timing of the fires of northern Southeast Asia was consistently good in terms of start time, length of the burning season, and cessation of the burning when compared against AERONET and MISR measurements.The representation in terms of timing over southern Southeast Asia was not as good, but still quite acceptable, when compared against AERONET and MISR measurements, with the duration of the fire season well captured in strong fire years and the strongest part of the fire season captured in low fire years.
Bringing in different simultaneous measurements of landsurface variables, fires, precipitation, and column aerosol measurements allows us to confirm that these patterns exist and are consistent with land-use burning.Given the difference in the timing and durations of the major monsoon seasons over these regions, the results are consistent.From this point, a simple regression-fit model was established to predict the AOD from measurements of land-use change variables, fires, and precipitation, which should be the basis upon which fires start in the environment.These simple regressionfit models (based on MODIS and TRMM) reproduced the onset, duration, and magnitude of the measured AOD from other measured sources (MISR and AERONET) well over northern Southeast Asia.The results of this regression-fit model demonstrate the ability to predict the AOD as observed by AERONET and MISR, using only measurements of land-use change variables and fires from MODIS, and precipitation from TRMM, measurements of some of the important and fundamental underlying factors controlling the fires.
These simple regression-fit models reproduced the onset, duration, cessation, and even the magnitude of the measured AOD from AERONET and MISR very well in northern Southeast Asia.These simple regression-fit models also reproduced the onset, duration, and cessation of the measured AOD from AERONET and MISR well in southern Southeast Asia, especially during the more intense burning years.The main issue in southern Southeast Asia, however, was that the magnitude over this region was strongly underestimated.These results still underestimate the column loading, but by a magnitude of 30 % or less, which is far better than the typical scaling factors applied of 1.7 (70 %) or more, and consistent with the results in Cohen and Wang (2014) and Cohen (2014) which show that there is an underestimate of both the overall magnitude and the fire magnitude, and that correcting for the former leads to an underestimate in the latter of 20 to 30 %.The result is not only larger in magnitude than the GFED (Global Fire Emissions Database) emissions products but also includes regions which are considered to have zero emissions in the GFED data set, a worrying conclusion since a value of 0 cannot be scaled up by a scaling factor.Some reasons for this include emissions sources which are more variable in space and time, such as the clearing of primary forests, peat burning, and rapid development, and increased cloud cover reducing the number of available measurements over large portions of this region by a significant amount.Further, the inter-seasonal periods in southern Southeast Asia tend to be both more rainy and more cloud covered than in northern Southeast Asia due to large-scale convection and other regional disturbances like the MJO and the IOD.
There is a strong and consistent change in the land-use variables occurring during the local fire season over both northern and southern Southeast Asia, although these relationships, as expected, are different over the two regions due to different types of land-use change.The relationships between burning of primary forests, grasslands or crops, and peat should all be different.Additionally, there is an important secondary use for these relationships, determining whether the observed smoke is locally produced of transported from far upwind.For example, it is clearly noted that the land-use changes are much smaller during the second non-annually occurring peak in northern Southeast Asia, implying that while there may be some contribution from local sources, there is also a large amount of smoke transported from other regions.This comes from the idea that if the land itself did not change very much, then the emissions of smoke produced must have been considerably lower.The timing of this smaller peak matches the timing of the fire occurrence over southern Southeast Asia with a very high level of correlation.Additionally, it also cannot be ruled out that the smoke could be urban pollution from South Asia.However, there is no evidence that any of the smoke in southern Southeast Asia originates from any region other than its own sources.
Further, we explored the added value of using higher temporal resolution data, which are usually thought to add improved value.Due to the large amount of cloudiness encountered, there was a much reduced number of measurements available over southern Southeast Asia during the fire season using 1-day average values as compared to 8-day average values, leading to less statistical relevance.In the end, it was not possible to have a reasonable reproduction of the measured AERONET and MISR values of the onset, duration, and ending of the fires using 1-day average MODIS and TRMM data as compared to when using 8day average MODIS and TRMM measurements to develop the regression-fit relationships.Even with the 8-day average data and the associated regression-fit relationships, the magnitude of AOD during southern Southeast Asia's fire season is significantly too low, although in northern Southeast Asia it is low but not more than the magnitude of the uncertainty of the input measurements themselves.The correlation between the regression-fit model AOD and AERONET stations over the entire decadal time period, using 8-day average MODIS data, ranges from R 2 = 0.42 to R 2 = 0.75.While monthly average data from MODIS do not provide as fine resolution for the duration, onset, and end times of the fires, they provide the best match in terms of the magnitude of the AOD measurements from AERONET and MISR.However, when using MODIS data on a monthly average basis, the regression-fit model AOD gives a better performance with the correlation coefficient between AOD and AERONET stations ranging from R 2 = 0.70 to R 2 = 0.90.Furthermore, the correlation over the regions of interest S 1 and S 2 between the regression-fit model and MISR measurements of AOD ranges from R 2 = 0.57 to R 2 = 0.81.This is due partially to less underrepresentation of very high short-term peaks, as well as additional data points being available in the MODIS fire and land-use products at longer average time durations.This is a counterintuitive result, with many in the community stressing the added value of higher-frequency measurements, but is consistent with the fact that such spaceborne measurements are severely limited by clouds over this region of the world during the fire season.MISR has shown to represent the magnitude of the AOD well, with the measurements from monthly average MISR measurements and monthly average AERONET measurements being basically the same.Therefore, the ability of the regression-fit model to capture the monthly average AOD from both MISR and AERONET, in terms of both the inter-annual and intra-annual variability in the fire seasons, is significant and shows that the changes in the land surface and the impacts of precipitation are indeed driving the atmospheric loading of AOD and hence the impact of the fires over this region on the decadal scale.Further, as it is widely known, peat can burn and smolder for an extended period of time after any measured fire has gone away; therefore, by extending the average value for the fire, it allows for a better matching with the total emissions, which will continue to often be produced for weeks after any visible flame or surface heat is observed.Thus, one of the important findings is to examine the most ideal temporal resolution at which to use the data, whether it be daily, weekly, or monthly.While most of the published literature leans towards using high-frequency daily data (or individual swathby-swath data, where available), we determine and validate that using weekly or monthly average data leads to a better ability to accurate reproduce the measured values, explain why that is the case, and then quantify some of the impacts and limitations of this result.
This study highlights the importance of taking into account land-use variable and precipitation for estimating AOD correctly both in time and magnitude, even though magnitude remains hard to capture on a 8-day basis.One significant bias in the magnitude of the results must be due to problems of the relationships over the region being not properly captured, such as the different anthropogenic driving forces of the landclearing being significantly different over the two regions.A second significant bias in the magnitude is due to the fact that there is a significantly more cloud cover over the two regions during their local burning seasons (Giglio et al., 2003).These results support the efficacy of the approach introduced here: that it is appropriate to use measured changes in the land, precipitation, and active fires from MODIS and TRMM to reproduce a working model of the atmospheric aerosol loading.Furthermore, other than Cohen (2014) and Cohen and Wang (2014) there are no works that have been able to satisfactorily estimate the loadings of or AOD associated with emissions aerosols over this region of the world without using some type of scaling.This method is able to reproduce the magnitudes by introducing physical parameterizations of scaling, and doing so based on a more fundamental driverbased approach.This allows us to improve our understanding of the relationships, both in terms of how they vary over space and time and in terms of physical drivers.
AERONET data used are Level 2.0, except for the four stations mentioned that have no Level 2.0 data, and in these cases Level 1.5 data are used.The data can be found at http: //aeronet.gsfc.nasa.gov.
MODIS data used for AOD are Collection 6, Level 2. The data can be found at https://modis.gsfc.nasa.gov/data/.
MODIS data used for NDVI, LAI, and fire count are Collection 5.1, Level 2. The data can be found at https://modis.gsfc.nasa.gov/data/.
TRMM data used are from Collection 3B42.The data can be found at https://pmm.nasa.gov/data-access/downloads/TRMM.
Specific details of the various data sets used, as well as assumptions made in selecting and handling the data, are provided in Sect.2.2.
The Supplement related to this article is available online at doi:10.5194/acp-17-721-2017-supplement.

Figure 1 .
Figure 1.Domain with the two EOF regions highlighted and the location of the AERONET stations.

Figure 2 .
Figure 2. First line: EOF1 (38.2 % of variance) (a) and EOF2 (13.3 % of variance) (b) of the AOD (2001-2013).Regions of highest AOD variability are delineated by black dots.Second line: PC 1 (cutoff 0.006, PC on the left-hand axis, AOD on the right-hand axis) (c) (red curve) and their associated AOD (green curve) averaged on the region.Third line: PC 2 , (cutoff 0.01, PC on the left-hand axis, AOD on the right-hand axis) (d) (red curve) and their associated AOD (green curve) averaged on the region.

Figure 3 .
Figure 3. Climatological values of LAI (first column) and NDVI (second column) for the 2001-2013 period.Average values are displayed on the first line, while the standard deviation is displayed on the second line.

Figure 4 .
Figure 4. Regression coefficients(δ 1 ) associated with fire count for REG 1 .Regions of highest AOD variability from the EOF analysis are delineated by black dots.

Figure 7 .
Figure 7. Basic statistics between MISR and AOD north (a, b) and AOD south (a, b) on a monthly basis(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013).Regions of highest AOD variability from the EOF analysis are delineated by black dots.Within the northern region the mean correlation is 84.8 % and the average is 77.3 % (mean error is 0.06), while in the southern region the mean correlation is 79.2 % and the average is 72.4 % (mean error is 0.08).The mean errors are given on the left-hand side, while the correlations are given on the right-hand side.

Table 2 .
Statistics over the respective northern and southern regions compared to the AERONET stations.Overlapped periods between the reconstructed AOD AODNorth/South and AERONET are stated in parenthesis."Fire" denotes data analyzed only during the fire season, while "all" denotes the entire data set.
* Not statistically significant at the p = 0.05 level.

Table 3 .
Average values of maximum AOD and average LAI and NDVI during the two annual AOD peaks over the northern region for the 2001-2013 period.