Estimating daily surface NO2 concentrations from satellite data – a case study over Hong Kong using land use regression models

Land use regression (LUR) models have been used in epidemiology to determine the fine-scale spatial variation in air pollutants such as nitrogen dioxide (NO2) in cities and larger regions. However, they are often limited in their temporal resolution, which may potentially be rectified by employing the synoptic coverage provided by satellite measurements. In this work a mixed-effects LUR model is developed to model daily surface NO2 concentrations over the Hong Kong SAR during the period 2005–2015. In situ measurements from the Hong Kong Air Quality Monitoring Network, along with tropospheric vertical column density (VCD) data from the OMI, GOME-2A, and SCIAMACHY satellite instruments, were combined with fine-scale land use parameters to provide the spatiotemporal information necessary to predict daily surface concentrations. Cross-validation with the in situ data shows that the mixed-effects LUR model using OMI data has a high predictive power (adj. R2 = 0.84), especially when compared with surface concentrations derived using the MACC-II reanalysis model dataset (adj. R2 = 0.11). Time series analysis shows no statistically significant trend in NO2 concentrations during 2005–2015, despite a reported decline in NOx emissions. This study demonstrates the utility of combining satellite data with LUR models to derive daily maps of ambient surface NO2 for use in exposure studies.


Introduction
It has been shown (WHO, 2013) that ambient exposure to outdoor nitrogen dioxide (NO2) has long-term health impacts stemming from cardiovascular and respiratory illnesses. In rapidly urbanizing countries such as China the cost of poor air quality is especially high (e.g. Chen et al., 2012; Gu et al., 2012). In particular, the Hong Kong Special Administrative Region (SAR) has seen significant economic growth in recent decades, which has resulted in the emergence of photochemical smog events caused by increased nitrogen oxide (NOx) emissions. These effects have been further exacerbated by transported emissions and pollution from the nearby Pearl River Delta (PRD; Xue et al., 2014). It has previously been estimated that air quality improvement from the annual average to the lowest pollutant levels of better visibility days, comparable to the World Health Organization (WHO) air quality guidelines, would lead to 1335 fewer deaths a year over this region, with a saving of over USD 240 million in both direct costs and productivity losses (Hedley et al., 2008).
Reliable exposure assessment requires constructing accurate maps of average pollutant concentrations. However, concentration data are often sourced from sparse in situ measurements which are typically from regulatory monitoring networks. Mapping pollutant exposure therefore requires the spatial interpolation of these measurements over a fine scale, taking into account known emission sources and sinks to estimate the true pollutant distribution. A possible technique to achieve this interpolation is land use regression (LUR; Hoek et al., 2008), in which concentrations measured by in situ stations are correlated with predictor variables such as traffic or population density using a geographic information system (GIS). A multivariate linear regression model is constructed based on significant covariates, which can then be used to estimate the pollutant concentration elsewhere.
LUR models are considered to be advantageous, as unlike dispersion modelling they do not require detailed information about atmospheric conditions as input data. As they are based on linear regression, LUR models are computationally inexpensive to run compared to dispersion modelling. Previously, LUR models have been used to model species such as NOx and particulate matter over spatial scales ranging from cities to countries (e.g. Beelen et al., 2013; Eeftens et al., 2012; Chen et al., 2010; Meng et al., 2015). However, most LUR models are limited by their temporal resolution, and are typically used to determine seasonal or annual concentrations. Methods to improve the temporal resolution of LUR models often involve rescaling temporally coarser models based on trends observed in regulatory monitoring data.
Published by Copernicus Publications on behalf of the European Geosciences Union.
In addition to in situ networks, NO2 can also be measured from space by satellite instruments (Monks and Beirle, 2011). Satellite datasets have some advantages over in situ networks, in that their long service life and revisit time can provide long-term monitoring of major emission sources and ambient atmospheric conditions, allowing for synoptic coverage of both spatial and temporal variation over urban areas. However, these instruments are only capable of measuring tropospheric vertical column densities (VCDs), and so cannot be readily compared with in situ concentrations without accurately modelling the NO2 vertical profile to separate the above-ground contribution (e.g. Bechle et al., 2013). Also, because of their coarse spatial resolution, satellites are not capable of resolving fine-scale urban variation. For instance, modelled NO2 VCDs at the same spatial footprint as the Ozone Monitoring Instrument (OMI; Levelt et al., 2006) over North American megacities were found to have a 20-30 % negative bias when compared to fine-scale models (Kim et al., 2016).
Data from satellites have previously been used in NO2 LUR models over large geographic regions. For instance, average surface concentrations derived from tropospheric NO2 VCDs measured by OMI have been used as predictor variables to estimate annual NO2 concentrations over the United States (Novotny et al., 2011), Western Europe (Vienneau et al., 2013), and Australia (Knibbs et al., 2014). OMI tropospheric VCDs have also successfully been used directly without deriving a surface concentration to model the annual NO2 concentration over the Netherlands (Hoek et al., 2015). In all cases the inclusion of OMI data as a predictor variable resulted in good agreement with in situ measurements, and improved predictive performance when compared with equivalent LUR models which did not include OMI data.
The aforementioned examples can only provide time-averaged concentrations, and so may be sensitive to daily variations in NO2 caused by changes in local meteorology or emission sources. Daily satellite measurements may contain useful information about both of these effects, and so could be applied to address this issue. Lee and Koutrakis (2014) did so using a mixed-effects model. In this LUR model, the OMI tropospheric VCD was included with both fixed and random effects. Fixed effects representing temperature and wind speed were also included, along with land use terms such as population density and developed area. The LUR model was found to have high predictive capability (R2 = 0.79) when used to estimate daily NO2 concentrations over the New England region of the USA.
A similar mixed-effects approach could potentially be used to predict NO2 concentrations over China. Because of limited data availability there have been few exposure assessment studies of Chinese air quality. A LUR model would allow for daily high-resolution maps to be developed for such studies. The objective of this work is therefore to create and validate a LUR model for forecasting surface NO2 concentrations over Hong Kong, and to assess its utility.

Method
For this work surface NO2 concentrations were measured and forecast over the Hong Kong SAR between 2005 and 2015. This time period was chosen as a compromise between ensuring adequate representation of seasonal cycles and the availability and quality of the satellite data (see below).

In situ data
The LUR models used in this work were both calibrated and validated by surface NO2 concentrations measured by in situ stations from the Hong Kong Air Quality Network (HK-AQN). These stations are maintained by the Hong Kong Environmental Protection Department (HKEPD, 2007). Between 2005 and 2015, 11 monitoring stations measuring ambient pollutant concentrations were in operation (see Fig. 1). These stations provide hourly measurements of CO, SO2, O3, NOx, NO2, and particulate matter. NO2 concentrations are measured through a combination of chemiluminescence and differential optical absorption spectroscopy (DOAS; Platt and Stutz, 2006). These stations are placed on buildings, away from traffic junctions, and so are thought to be representative of ambient conditions. Of these stations, 10 are located in developed regions while one (Tap Mun) is located in the Sai Kung Country Park, and so can be considered a rural background station. Throughout the study period, the HKEPD have reported that the precision and accuracy of the NO2 measurements have been within the ±20 % control limit.
The number of these in situ stations, and their spatial sampling, is smaller than that typically chosen for LUR modelling (Hoek et al., 2008). However, it is not entirely without precedent, as Li et al. (2010) used 14 in situ stations from the local regulatory monitoring network in their LUR model to predict NO2 concentrations over Jinan, China. Therefore, it may be possible to model an equivalent Chinese megacity using a similarly limited in situ network.

Satellite data
Between 2005 and 2015 there were three satellite instruments measuring tropospheric NO2 VCDs: OMI, GOME-2A, and SCIAMACHY. These instruments and their retrieval algorithms are briefly summarized in this section. All retrieval algorithms based on these instruments derive these VCDs by first retrieving a total slant column density (SCD) from the measured visible (400-500 nm) reflectance spectrum using the DOAS technique. The stratospheric component of the total column is then separated, either by empirical estimation based on unpolluted regions (e.g. Richter and Burrows, 2002) or by model assimilation (e.g. Boersma et al., 2004). In addition to this, the column is also weighted by an air mass factor (AMF) (Palmer et al., 2001) calculated from a priori information to account for biases resulting from scene-specific features (e.g. viewing geometry, scene albedo, NO2 vertical profile).
Tropospheric NO2 VCDs from these instruments were previously verified over China and Japan between 2006 and 2011 using ground-based multi-axis DOAS (MAX-DOAS) measurements by Irie et al. (2012). It was found that the biases between these instruments and the MAX-DOAS observations were small enough to be considered insignificant, suggesting that data from these instruments could be combined for use in air quality studies.
Because of their varying ground pixel sizes, all satellite data products used in this work were reprojected onto a 0.01° grid. To avoid biases from cloud contamination, only ground pixels where the reported cloud fraction was < 30 % were used from all instruments. For scanning instruments, only pixels observed during forward scans were used.
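The screening and gridding step described above can be illustrated with a minimal sketch. It assumes each ground pixel is reduced to its centre coordinate and simply averaged into the 0.01° cell that contains it; a real reprojection would spread each pixel footprint over many cells. The dictionary keys (`lat`, `lon`, `vcd`, `cloud_fraction`, `forward_scan`) and function names are hypothetical and not taken from any data product.

```python
import math
from collections import defaultdict

GRID_DEG = 0.01            # grid spacing stated in the text
MAX_CLOUD_FRACTION = 0.30  # pixels at or above this cloud fraction are discarded


def grid_index(lat, lon, step=GRID_DEG):
    """Return the integer (row, col) of the grid cell containing a point."""
    return (math.floor(lat / step), math.floor(lon / step))


def grid_pixels(pixels, step=GRID_DEG, max_cloud=MAX_CLOUD_FRACTION):
    """Average the VCDs of screened pixel centres that fall in each grid cell.

    `pixels` is an iterable of dicts with the hypothetical keys
    lat, lon, vcd, cloud_fraction and (optionally) forward_scan.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p in pixels:
        if p["cloud_fraction"] >= max_cloud:
            continue  # cloud screening: keep only cloud fraction < 30 %
        if not p.get("forward_scan", True):
            continue  # scanning instruments: keep forward scans only
        key = grid_index(p["lat"], p["lon"], step)
        sums[key] += p["vcd"]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Binning by pixel centre is the simplest possible choice and is used here only to make the two screening rules concrete.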

Ozone monitoring instrument (OMI)
The Dutch-Finnish Ozone Monitoring Instrument (OMI, Levelt et al., 2006) has been in continuous operation since 2004. OMI offers daily global coverage, with a local equatorial overpass time of approximately 13:45. The instrument images a 2600 km swath binned to 60 across-track pixels, with a nadir ground pixel size of 13 km × 24 km. While this pixel size allows for city-scale features to be resolved, the pixel size increases considerably away from the nadir, as OMI is a pushbroom spectrometer. To compensate for this effect in this work, the ground pixels are weighted by their size and cloud fraction when gridded using the method detailed in Wenig et al. (2008).
Since 2007 OMI has also been affected by a partial blockage of its entrance aperture. This obstruction has resulted in the so-called "row anomaly", in which the measured radiances are systematically biased depending on the across-track viewing angle, season, and latitude. At the time of this work this anomaly affected roughly half of the 60 across-track pixels, which were removed from the analysis.
For this work the OMI tropospheric VCDs were taken from the NASA Standard Product (OMNO2, v 3.0; OMNO2 Team, 2016). In this product the global stratospheric NO2 field is estimated by interpolating over known unpolluted regions and then subtracted from the total column (Bucsela et al., 2013). Further information about the SCD fit and the AMF computation can be found in Marchenko et al. (2015) and Bucsela et al. (2013).
www.atmos-chem-phys.net/17/8211/2017/ Atmos. Chem. Phys., 17, 8211-8230, 2017

Global Ozone Monitoring Experiment-2 (GOME-2A)
The Global Ozone Monitoring Experiment-2A (GOME-2A, Callies et al., 2000)  For this work the GOME-2A tropospheric VCDs were taken from the TEMIS TM4NO2A product (v 2.3; Boersma et al., 2004). In this product the total SCD is assimilated into the TM4 chemical transport model (CTM) to obtain the stratospheric column. Further information about the SCD fit and the AMF computation can be found in TEMIS (2010).

SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY)
The SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY, Bovensmann et al., 1999) was in operation between 2002 and 2012. SCIAMACHY used both limb and nadir viewing geometries to provide columnar and profile information. However, because of this unique design global coverage was only achieved every 6 days. SCIAMACHY had a local equatorial overpass time of 10:00. Like GOME-2A, SCIAMACHY employed a scanning mirror to image a 960 km swath, which allowed for a constant ground pixel size of 60 km × 30 km. For this work the SCIAMACHY tropospheric VCDs were also taken from the TEMIS TM4NO2A product (v 2.3; Boersma et al., 2004). This dataset was chosen so as to minimise potential biases between the satellite datasets caused by differences in their retrieval algorithms.

Mixed effects land use regression model
LUR models are typically fixed effect models, in which the concentration of a pollutant is expressed as the linear sum of variables approximating the influence of various emission sources and sinks. These variables are "fixed" in the sense that they are temporally invariant, and apply to the mean atmospheric state over the entire observation period. As a result, traditional LUR models are sensitive to unobserved heterogeneity arising from temporal variability in emissions or other ambient conditions. In this work, an additional variable is required to cover time-dependent effects (so-called "random" effects) in order to model daily NO2 concentrations. In practice, time-dependent effects are modelled in linear regression through the inclusion of a discrete "dummy" variable to describe a property of the data, such as the in situ station where a particular measurement was made. These effects are considered to be "random", as the magnitude and/or sign of the effect is not expected to be the same over all measurements. A model combining both fixed and random effects is therefore known as a "mixed-effects" model, in which the concentration is expressed as the sum of fixed variables along with other variables whose effects vary with time or other properties classified by the dummy variables. In this work, these models are fitted from the observation dataset using the lme4 R software package (Bates et al., 2012).
The mixed-effects LUR models considered in this work are similar to the one developed by Lee and Koutrakis (2014). The daily ambient NO2 concentration at a location $i$ on day $j$ is assumed to be a linear function of the gridded daily satellite tropospheric NO2 VCD retrieved over the same location, $\Omega_{ij}$:
$$[\mathrm{NO_2}]_{ij} = (\alpha + u_j) + (\beta_1 + v_j)\,\Omega_{ij} + \sum_{m \geq 2} \beta_m X_{ijm} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \quad (u_j, v_j) \sim N[(0, 0), \Sigma]. \qquad (1)$$
This approach accounts for day-to-day variations in the surface NO2/$\Omega$ ratio, while also reducing the influence of days with insufficient in situ or satellite data.
In Eq. (1) $\alpha$ and $u_j$ are the fixed and random intercepts, respectively, while $\beta_1$ and $v_j$ are the fixed and random slopes of $\Omega_{ij}$, respectively. The $\beta_m$ are the fixed slopes of additional predictor variables $X_{ijm}$ at point $i$ and day $j$. The error term of the model is represented by $\varepsilon_{ij}$, while $\Sigma$ represents the variance-covariance relationship for the day-specific random effects.
The main source of spatiotemporal information in the model is the relationship between NO2 and the satellite VCD derived from the in situ and satellite measurements, while the other parameters are used to give a local context for probable NOx emission sources and sinks. The fixed terms in Eq. (1) represent the spatial average of the NO2–VCD relationship, while the random terms model the day-specific variations. The day-specific relationship may be the consequence of daily variations in the NO2 vertical profile caused by changes in boundary layer height, emissions, or other influences. For this work the daily mean NO2 concentration from each of the stations shown in Fig. 1 was used as the dependent variable in Eq. (1). These concentrations were log-transformed to ensure that the input dataset was normally distributed.
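To make the structure of the mixed-effects model concrete, the sketch below evaluates a prediction from already-fitted coefficients: a fixed plus day-specific intercept, a fixed plus day-specific slope on the satellite VCD, and fixed slopes on the additional predictors. It does not fit anything (the text uses lme4 for that), all function and argument names are hypothetical, and the natural logarithm is assumed for the log-transform since the base is not stated.

```python
import math


def predict_log_no2(vcd, x, alpha, beta1, betas, u_j=0.0, v_j=0.0):
    """Linear predictor for one location and day, without the error term.

    vcd    : satellite tropospheric VCD over the location on that day
    x      : values of the additional predictor variables
    alpha  : fixed intercept;  u_j : day-specific random intercept
    beta1  : fixed VCD slope;  v_j : day-specific random VCD slope
    betas  : fixed slopes of the additional predictors (same order as x)
    """
    if len(x) != len(betas):
        raise ValueError("x and betas must have the same length")
    return (alpha + u_j) + (beta1 + v_j) * vcd + sum(b * xi for b, xi in zip(betas, x))


def predict_no2(vcd, x, alpha, beta1, betas, u_j=0.0, v_j=0.0):
    """Back-transform to concentration units, assuming a natural-log transform."""
    return math.exp(predict_log_no2(vcd, x, alpha, beta1, betas, u_j, v_j))
```

Setting `u_j` and `v_j` to zero gives the average relationship over all days; non-zero values shift the intercept and slope for one particular day.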
As this is a purely empirical model, the modelled surface NO2 concentration is primarily a function of the in situ and satellite data used to train it. Therefore, surface concentrations are only modelled for a particular day, j, if at least one in situ station has ≥ 75 % of the expected hourly measurements and a cloud-free satellite observation on that day.
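The data-availability rule above can be written as a short check. This is a sketch under two assumptions that the text does not spell out: that 24 hourly measurements are expected per day, and that the cloud-free satellite observation must be over the same station that meets the 75 % threshold. Function and argument names are hypothetical.

```python
MIN_HOURLY_FRACTION = 0.75  # threshold stated in the text
EXPECTED_HOURS = 24         # assumption: one measurement expected per hour


def station_is_complete(n_valid_hours, expected=EXPECTED_HOURS,
                        min_fraction=MIN_HOURLY_FRACTION):
    """True if the station reported at least 75 % of the expected hourly values."""
    return n_valid_hours >= min_fraction * expected


def day_is_modellable(valid_hours_by_station, cloud_free_by_station):
    """True if at least one station has enough hourly data AND a cloud-free
    satellite observation on this day.

    valid_hours_by_station : dict of station name -> number of valid hourly values
    cloud_free_by_station  : dict of station name -> bool
    """
    return any(
        station_is_complete(hours) and cloud_free_by_station.get(name, False)
        for name, hours in valid_hours_by_station.items()
    )
```

With 24 expected values, the threshold corresponds to at least 18 valid hourly measurements.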

Spatial predictor variables
As in traditional LUR models, spatial predictor variables in this work are selected from a number of proxies describing the local meteorology and NO2 emission sources and sinks. These are summarized in Table 1 and discussed herein. Variables describing sources and sinks at a given location were also buffered using several circle radii: 100, 200, 300, 400, 500, 600, 700, 800, 1000, 1200, 1500, 1800, 2000, 2500, 3000, 3500, 4000, 5000, 6000, 7000, 8000, and 10 000 m. In all, this gave a total of 139 distinct variables to be presented to the model. Certain variables were also given fixed signs that βm must have. For instance, terms representing emission sources must have positive βm terms to represent the positive effect they have on the ambient NO2 concentration, while variables such as vegetation cover and surface elevation would have a negative βm.
At the time of this work no traffic density information for Hong Kong was available, so in order to estimate the possible contribution from traffic emissions it was thought that the total road length within a buffer radius would be a viable substitute. Road lengths were calculated from the OpenStreetMap dataset (Haklay and Weber, 2008). The road lengths of primary, secondary, and tertiary roads were considered as separate variables to account for the average difference in traffic density experienced by these road types. The coastline from the OpenStreetMap dataset was also used to calculate the distance to the sea for a given point, in order to simulate the possible influence of cleaner marine air and/or shipping emissions on the ambient NO2 concentration.
Residential NO2 emissions were thought to scale linearly with population density, which has been sourced from the WorldPop 2010 population density dataset (Stevens et al., 2015). The total population density within a buffer was calculated for a given point.
Urban area coverage was also assumed to be a good indicator of residential and industrial emissions. At the time of this work the highest resolution land cover dataset available over Hong Kong was the 0.5 km MODIS-based Global Land Cover Climatology (Broxton et al., 2014). The total vegetation cover (i.e. land covered by any vegetation type) was also used to simulate the effect of dry deposition on the ambient NO2 concentration. Both vegetation and urban cover were calculated as a percentage of the buffer area.
In addition to fixed spatial parameters, Lee and Koutrakis (2014) also suggested using meteorological data in the model to further explain the spatiotemporal variation in the surface NO2 field. For instance, surface temperature can be assumed to be a proxy for the actinic flux, and so for the photochemical rate of dissociation of NO2 into NO, while wind speed can be used as a proxy for the effect of advection on local concentrations. For this work the daily mean surface temperature, wind speed, and wind direction sourced from the ERA-Interim reanalysis dataset (Dee et al., 2011) were used as predictor variables.
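As an illustration of how a buffered predictor might be computed, the sketch below sums a gridded quantity (for example a population raster reduced to cell centres) within each of the 22 circular buffer radii listed in the text. It is only a sketch: the great-circle distance, the cell-centre approximation, and all names are assumptions made for this example and not the GIS procedure used in the study.

```python
import math

# The 22 buffer radii (in metres) listed in the text.
BUFFER_RADII_M = [100, 200, 300, 400, 500, 600, 700, 800, 1000, 1200, 1500,
                  1800, 2000, 2500, 3000, 3500, 4000, 5000, 6000, 7000, 8000,
                  10000]
EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumed spherical Earth)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def buffer_totals(site, cells, radii=BUFFER_RADII_M):
    """Sum cell values whose centres lie within each buffer radius of a site.

    site  : (lat, lon) of the point of interest
    cells : iterable of (lat, lon, value) cell centres
    Returns a dict mapping radius in metres -> summed value.
    """
    distances = [(haversine_m(site[0], site[1], lat, lon), value)
                 for lat, lon, value in cells]
    return {r: sum(v for d, v in distances if d <= r) for r in radii}
```

Each radius yields one candidate predictor, which is how a handful of source and sink datasets expands into a large pool of variables.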

Predictor variable selection
To determine the optimal combination of predictor variables to be used in the LUR model, a robust stepwise regression approach similar to the one employed by Eeftens et al. (2012) was used. First, univariate regression was applied to all predictor variables. The predictor variable with the highest adjusted R2 was included in Eq. (1) as the first Xm. The remaining variables were then consecutively added to the model, and their effect on the model-adjusted R2 was noted. After all other variables had been considered, the predictor variable that resulted in the largest increase in the adjusted R2 was kept, provided that the following criteria were met: (1) the increase in the model-adjusted R2 was greater than 1 %, (2) the sign of the predictor variable coefficient conformed to the sign shown in Table 1, and (3) the signs of the other predictor variables already included in the model were not changed by the inclusion of the considered predictor variable.
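The selection loop can be sketched as a forward stepwise procedure on the adjusted R2. The sketch below departs from the text in three labelled ways: it uses ordinary least squares as a stand-in for the mixed-effects fit, it reads the 1 % criterion as an absolute gain of 0.01 in adjusted R2, and it picks the best candidate among those that satisfy the sign criteria. All names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(columns, y):
    """Return (coefficients with intercept first, adjusted R^2) for an OLS fit."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p + 1)] for a in range(p + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p + 1)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return coef, 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def forward_select(candidates, y, expected_sign, min_gain=0.01):
    """Add predictors one at a time while adjusted R^2 improves by > min_gain.

    candidates    : dict of name -> list of predictor values
    expected_sign : dict of name -> +1 or -1 (required coefficient sign)
    """
    chosen, current_adj = [], 0.0
    while True:
        best = None
        for name in candidates:
            if name in chosen:
                continue
            trial = chosen + [name]
            coef, adj = fit_ols([candidates[t] for t in trial], y)
            # Every coefficient (new and already included) must keep its sign.
            signs_ok = all(coef[i + 1] * expected_sign[t] > 0
                           for i, t in enumerate(trial))
            if signs_ok and (best is None or adj > best[1]):
                best = (name, adj)
        if best is None or best[1] - current_adj <= min_gain:
            return chosen, current_adj
        chosen.append(best[0])
        current_adj = best[1]
```

The later steps described in the text (removal of statistically insignificant variables and screening for collinearity) are not covered by this sketch.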
Predictor variables were added to the model until the model-adjusted R2 no longer increased by > 1 %. The p values of each predictor variable were then calculated, with statistically insignificant variables (i.e. p > 0.05) sequentially removed from the model until all predictor variables became statistically significant. The multicollinearity of the remaining predictor variables was then assessed by calculating the variance inflation factor (VIF) for each one. Predictor variables where VIF > 10 were sequentially removed from the model to determine their influence on the model predictive power.
The models developed for this work were also tested for influential observations by calculating the Cook's D for each surface NO2 measurement. Observations where the Cook's D was > 1 would have been removed from the analysis and their effect on the model performance assessed. However, in this work no such observations were detected over any of the stations involved.
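For intuition about the VIF threshold, the sketch below covers only the special case of a model with two predictors, where the VIF of either predictor reduces to 1 / (1 − r²) with r the Pearson correlation between them. With more predictors the VIF requires regressing each predictor on all the others, which this sketch does not do. Function names are hypothetical.

```python
import math

VIF_THRESHOLD = 10.0  # threshold stated in the text


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)
```

Uncorrelated predictors give a VIF of 1, and a VIF above 10 corresponds to r² above 0.9 in this two-predictor case.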

Model variants
Daily forecasts of surface NO2 will be affected by the diurnal and seasonal cycles that affect transport and production. Because of their different revisit times, data from the satellite instruments have previously been combined to yield information about these cycles (e.g. Boersma et al., 2008; Hilboll et al., 2013). Therefore, it may be possible to enhance the model predictive power by using observations by multiple satellite instruments at the same time and location. Equation (1) can therefore be adapted to include random and fixed slopes and intercepts for each satellite instrument. For instance, a model combining SCIAMACHY and OMI data would be given by Eq. (2). In this case the fixed and random slopes of the VCD terms now represent the average and day-specific NO2–VCD relationship as observed by each instrument, which may allow for the diurnal cycle to be better represented in the model. As with single-instrument models, only days with both in situ data and cloud-free observations from both satellite instruments can be modelled with this approach.
Additionally, previous studies (e.g. Beelen et al., 2013) used separate LUR models to account for seasonality in surface NO2 concentrations. While the use of daily satellite data should help to account for this effect, over short timescales the systematic difference between seasons may not be immediately recognizable and may lead to a poor model fit.
For this work several models were developed to explore these concepts, which are summarized in Table 2. Model 1 is the reference against which all other models are compared, as the OMI dataset is the temporally longest with minimal issues from spatial sampling or cloud cover. Model 2 attempts to account for the seasonal cycle by training two LUR models looking at different months for all years: winter (November-April) and summer (May-October). Several LUR models are also trained to investigate the predictive utility of each satellite instrument separately. In addition to this, several models based on Eq. (2) were assessed, trialling different combinations of satellite instruments in order to better account for diurnal variations in NO2. Finally, a multiple linear regression model without using satellite data or mixed effects, while forcing temperature and wind speed as predictor variables, was also assessed as a reference to compare against the other models.
Other models based on those listed in Table 2 were also tested, but are not included in this work due to anomalous results. A seasonal model similar to Model 2 was tested with GOME-2A and SCIAMACHY data, but in both cases the fixed satellite data slope was found to be statistically insignificant in the winter season. It is likely that this result was due to both instruments lacking an adequate number of winter measurements over Hong Kong because of their comparatively large ground pixel size and limited coverage.
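The seasonal split used for Model 2 can be expressed as a small helper. This is a sketch of the month-to-season mapping stated above (winter: November-April, summer: May-October); the record layout and function names are hypothetical.

```python
from datetime import date

WINTER_MONTHS = {11, 12, 1, 2, 3, 4}  # November-April, as defined in the text


def season(month):
    """Return 'winter' for November-April and 'summer' for May-October."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return "winter" if month in WINTER_MONTHS else "summer"


def split_by_season(records):
    """Split (date, value) records into two lists keyed by season."""
    groups = {"winter": [], "summer": []}
    for day, value in records:
        groups[season(day.month)].append((day, value))
    return groups
```

Each of the two groups would then be used to train its own model, so that observations from all years contribute to the season they fall in.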

Results and discussion
The properties of each of the models (predictor variables, adjusted R2) listed in Table 2 are summarized in Table 3. Comparisons between these models may be biased by the number of observations used to produce each model, owing to the difference in mission lifetimes and ground pixel sizes. Additionally, models combining OMI and SCIAMACHY data always failed to converge, regardless of the predictor variables included. This null result may be due to a lack of cloud-free days when both instruments were coincident over Hong Kong. Despite this, it is clear that models including satellite data have superior predictive performance as compared with the reference model.
Figure 2  By contrast, the raw OMI data do not adequately resolve any of these features, showing only a single enhancement over Bao'an which declines radially with distance. This discrepancy is likely to be the consequence of poor spatial sampling and the comparatively higher emissions from Shenzhen dominating the observed NO2 column. The difference in detail between these two datasets shows the potential utility in downscaling coarse satellite data with mixed-effects LUR models to better resolve emission sources and spatial distribution.

Model intercomparison
Figure 3 shows the mean surface NO2 concentration predicted by all models between 2007 and 2012, which was a time period common to all of them. Because of differences in instrument spatial resolution and ground coverage, only 38 days in this time period were found to have cloud-free measurements by all three satellite instruments. As a compromise, Fig. 3 shows the mean of all data predicted by each model.
Over the Hong Kong SAR, all models show clear enhancements over the areas already noted in Fig. 2. The models also all predict a negative longitudinal gradient; concentrations predicted by the models over Lantau South Country Park (22.24° N, 113.93° E) were on average 2.6 times higher than those over Sai Kung Country Park (22.40° N, 114.35° E). This gradient may potentially be the result of in situ station coverage; the most eastern station (Tap Mun) is situated in the Sai Kung Country Park, while the most western station (Tung Chung) is within a residential area and near Hong Kong International Airport.
The distribution of elevated NO2 concentrations over the Hong Kong SAR does not significantly change between models, though the longitudinal gradient is more pronounced in some models than others. In models 2-8 the gradient is strong enough that the mean surface NO2 concentrations predicted over Lantau South Country Park are ∼ 40 µg m−3. These values seem unrealistic, as the MODIS and WorldPop datasets suggest that the region is mostly uninhabited and undeveloped compared to districts like Aberdeen and Yantian, which show similar concentrations.
Outside of Hong Kong, the distribution of the Bao'an and Shenzhen enhancements changes considerably between models, depending on whether road networks or population density and urban area coverage were used. Because of a lack of available surface concentration data from mainland China, these regions cannot be validated in this work.

Seasonal variation
All models including satellite data were found to predict higher surface NO2 concentrations during the winter than in the summer, particularly over urban areas. This seasonal dependence may be caused by lower boundary layer height and longer NOx lifetime during winter, as well as increased emissions from residential heating. Figure 4 shows this seasonal gradient in the mean surface NO2 concentration during 2005-2015 predicted by models 1 and 2 over both seasons.
Both models in Fig. 4 are highly correlated in the summer (R2 = 0.97), as they are largely based on the same variables, though Model 2 does not feature a longitudinal gradient. However, in winter the models are much less correlated (R2 = 0.78), with Model 2 showing a much stronger longitudinal gradient than Model 1. As in Fig. 3, this gradient leads to unphysically high concentrations being predicted over uninhabited regions such as Lantau South Country Park, making it unlikely that this is a realistic model of winter air quality over Hong Kong.
[Figure 3 caption, truncated: ... 2 for 2007-2012. Each plot also shows the number of cloud-free days from which the models could be trained.]
From Table 3 it is clear that the winter model had over 1000 fewer observations to use compared to the summer model. During winter there are fewer cloud-free observations, which would lead to the model overfitting the data available. Despite having fewer observations to use, the winter model-adjusted R2 is higher than that of the summer model, suggesting that overfitting has occurred. The spatial footprints of GOME-2A and SCIAMACHY are much larger than that of OMI, which would result in fewer cloud-free observations being available in the same time period, and so lead to the null results observed when seasonal models involving these datasets were attempted.

Cross-validation with in situ data
Based on the adjusted R2 values for each model shown in Table 3, it appears that models 6 and 8 are the best performing models, suggesting that using more than one satellite dataset improves model prediction. However, the adjusted R2 statistic may be artificially inflated by overfitting to the input data, and so may be an overly optimistic descriptor of model performance. Ideally these models would be validated against additional measured concentrations from stations independent of the current dataset. However, in the absence of other stations measuring ambient NO2, the LUR models in this work were validated using cross-validation (CV), in which subsets of the data used to initially train the model are iteratively removed from the training process and used to compare against the model forecast.
LUR models are typically validated using two major CV approaches: leave-one-out cross-validation (LOOCV) and k-fold cross-validation. LOOCV involves data from a particular station being reserved from the model training process and used to validate the model, such that data from any one station are validated against a model trained using data from every other station. Conversely, k-fold cross-validation involves randomly partitioning the data into k equal-sized subsets (i.e. from all stations), and then using each subset to validate the model trained using the remaining k − 1 subsets. Because of the limited number of stations available for this work, removing entire stations from the training dataset would remove significant information from the model training process, and so unfairly bias the validation results. The limitations of LOOCV compared to k-fold cross-validation when applied to LUR models based on limited in situ data have previously been discussed in Wang et al. (2016) and Johnson et al. (2010). Because of the limited number of in situ stations available, this work used a 5-fold CV approach to validate the models, in which 80 % of the available data is used to calculate the coefficients and intercepts of each of the models shown in Table 2. These models are then used to estimate the surface concentrations of the remaining 20 % of the data. This process is repeated until every data point has been estimated by a model that has not been trained using it.
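The 5-fold procedure can be sketched generically as follows. The helper takes any `fit` and `predict` callables, so it does not reproduce the mixed-effects fit itself; the function names, the fixed random seed, and the near-equal (rather than exactly equal) fold sizes are assumptions made for this illustration.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Randomly partition the indices 0..n-1 into k near-equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=5, seed=0):
    """Return one out-of-sample prediction per observation.

    fit(train_x, train_y) -> model
    predict(model, xi)    -> predicted value for a single observation
    Each observation is predicted by a model trained without it.
    """
    predictions = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            predictions[i] = predict(model, x[i])
    return predictions
```

With k = 5, each model is trained on roughly 80 % of the data and evaluated on the remaining 20 %, matching the split described above.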
In this work the predictive performance of each model is determined by comparing the cross-validated model dataset against the original in situ measurements through linear regression. Agreement between the two datasets is quantified by calculating the adjusted R2, gradient, intercept (referred to henceforth as the model bias), and root mean square error (RMSE, µg m−3). Because the models developed in this work are purely statistical, the CV gradient and bias against the in situ data are considered to be the main measures of model accuracy. The RMSE of a model was calculated as the square root of the mean of the squared errors. Table 4 shows the results of the cross-validation on each of the models considered in this work.
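As an illustration of these four statistics, the sketch below regresses cross-validated estimates on observations. The function name `cv_metrics` is hypothetical, and the regression direction (model estimate as the dependent variable) and the use of one predictor in the adjusted R2 correction are assumptions, as the text does not specify them.

```python
import math

def cv_metrics(observed, predicted, n_predictors=1):
    """Return (gradient, bias, adjusted R2, RMSE) for predicted vs observed."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    gradient = sop / soo                      # slope of the regression line
    bias = mp - gradient * mo                 # intercept ("model bias")
    r2 = sop ** 2 / (soo * spp)               # squared Pearson correlation
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    # RMSE: square root of the mean of the squared errors
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return gradient, bias, adj_r2, rmse

# Illustrative values only (µg m−3)
obs = [10.0, 20.0, 30.0, 40.0, 50.0]
pred = [0.9 * o + 5.0 for o in obs]
gradient, bias, adj_r2, rmse = cv_metrics(obs, pred)
```

In this example the estimates are perfectly correlated with the observations (adjusted R2 of 1) yet still have a gradient below 1 and a non-zero bias, which is why the gradient and bias are treated separately from R2 and RMSE.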
From the CV-adjusted R2 and RMSE, it is clear that all the models including satellite data perform better than Model 9, suggesting that there is some utility in incorporating satellite data in LUR models. Model 2 has the highest CV-adjusted R2 and lowest RMSE, suggesting that OMI data offered the best agreement with in situ measurements, so long as seasonal effects are accounted for. Sources of error reflected by the RMSE in models 1-8 may include coarse spatial sampling by the satellite instrument, or retrieval algorithm errors in the satellite dataset.
Models using only one satellite dataset also perform better than those combining two or more datasets. A likely cause of this difference is that the models using more than one satellite dataset had fewer cloud-free observations to use, because of complications arising from different spatial resolutions and orbital coverage. This lack of available data would therefore have resulted in these models overfitting the input data available.

Spatial representivity
For all models in this work the CV dataset can be grouped by station, which allows side-by-side comparisons of model performance over all regions. Figure 5 shows the CV-adjusted R2 and RMSE for each model over each station. It is clear from the CV that, with the exception of Tap Mun, models 1-4 agree much better with the in situ data overall than models 5-9. Figure 5 also shows that models 1-4 on average have much lower RMSEs over most stations excluding Tap Mun, which suggests that they offer a higher precision than models 5-9.
Table 4. The results of the 5-fold cross-validation (CV) applied to all the LUR models described in Table 2. Surface concentrations estimated using CV were compared against the original in situ measurements using linear regression, from which the adjusted R2, gradient, bias, and RMSE (µg m−3) are derived. The standard errors of the gradient and bias are also displayed, while the RMSE is also expressed as a percentage of the mean concentration estimated by the model.

However, over Tap Mun almost all models (excluding Model 2) perform poorly, with lower adjusted R2 values and RMSE values that are higher than the mean of the other stations. This result suggests that the models all have poor spatial representivity over unpopulated areas, because such regions are largely unrepresented by the in situ stations. Model 2 is somewhat of an outlier to this trend, as over Tap Mun the CV-adjusted R2 is 0.73, which is comparable to values retrieved over the other stations. Similarly, the CV RMSE retrieved over Tap Mun is also lower than the value retrieved by Model 1. This suggests that training a season-specific model may better account for variability between rural and urban areas.
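Grouping the cross-validated pairs by station (or, equally, by year) before computing the statistics can be sketched as follows. The helper name `rmse_by_group`, the group labels, and the numbers are hypothetical and purely illustrative; they are not values from Table 4 or Fig. 5.

```python
import math
from collections import defaultdict

def rmse_by_group(records):
    """records: iterable of (group, observed, predicted) tuples.
    Returns a dict mapping each group to its RMSE."""
    squared_errors = defaultdict(list)
    for group, observed, predicted in records:
        squared_errors[group].append((predicted - observed) ** 2)
    return {g: math.sqrt(sum(v) / len(v)) for g, v in squared_errors.items()}

# Illustrative (group, observed, predicted) values in µg m−3
records = [
    ("urban_site", 80.0, 74.0),
    ("urban_site", 95.0, 101.0),
    ("rural_site", 12.0, 25.0),
    ("rural_site", 9.0, 20.0),
]
station_rmse = rmse_by_group(records)
```

The same function can be reused for the temporal analysis by passing the observation year as the group key.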

Temporal representivity
The CV datasets produced in this work can also be grouped and validated by year to determine whether annual or decadal changes in NO2 are successfully predicted by models trained with all available data. The inclusion of satellite data as a predictor variable also raises the possibility of instrument degradation affecting model performance. Unlike in situ stations, satellite instruments can only be passively recalibrated over their lifetime, leading to a possible drift in retrieval precision that may progressively bias surface NO2 models (e.g. Dikty and Richter, 2011; Anand et al., 2015).
The LUR models are affected by the number of observations available, which in turn is also dependent on instrument degradation. One example of this is the OMI row anomaly, which since 2007 has grown to affect half of the instrument orbital coverage. Over time, this would lead to fewer available observations, which may lead to biases in the LUR models. The degradation in available measurements, combined with the potential decrease in precision of the DOAS fit over time, may result in a decline in the annual CV-adjusted R2 and a corresponding rise in the CV RMSE because of the increased uncertainty in the model.
Table 5 shows the annual CV-adjusted R2 and RMSE of models 1 and 2 between 2005 and 2015. While no statistically significant trend is observed in the CV-adjusted R2 values for either model, both models show a statistically significant decline in RMSE over time (Model 1: −0.28 % yr−1; Model 2: −0.11 % yr−1), which suggests that coverage losses or instrument degradation are not significant influences on model accuracy or precision. Table 5 also shows that on average the adjusted R2 of Model 2 is ∼ 8.0 % higher than that of Model 1, while the RMSE is ∼ 23 % lower, suggesting that the better performance Model 2 showed in Table 5 compared to Model 1 was not the result of anomalously high correlation with in situ measurements over certain years.

Influence of local meteorology
For models 1-8, the inclusion of temperature and wind speed from ERA-Interim was not found to significantly improve the adjusted R2 compared to the other considered variables. One possible reason for this may be that the spatial resolution of ERA-Interim is too coarse to capture the true variation in temperature and wind speed. Another possibility is that the satellite data implicitly contain information about ambient atmospheric conditions observed as part of the VCD measurement, so additional meteorological data may not be needed in the LUR model.
In order to determine whether meteorological data substantially improve the LUR model, Model 1 was trained again while forcing surface temperature and wind speed from ERA-Interim as predictor variables. The training process again selected the same variables shown in Table 3, with the addition of the total tertiary road length within 400 m. Wind speed and temperature were found to have a negative effect on surface concentration; the ERA-Interim temperature may represent the ambient actinic flux, while high wind speeds would increase mixing and therefore act to lower concentrations. Figure 6 shows the seasonal average surface NO2 concentration predicted by Model 1 with and without meteorological data for 2005-2015. The addition of meteorological data causes a ∼ 17 % mean increase in surface NO2 concentrations across the region, though no new emission sources are visible.
As with the other models, this model variant can be validated against the in situ measurement data using 5-fold CV and compared with the results in Table 4. When meteorological data were forced the CV-adjusted R2 was 0.806, compared with 0.775 before, suggesting that their inclusion improves the model agreement. Similarly, the model CV RMSE decreased to 12.0 µg m−3 (22.1 %) after including meteorological data. The CV gradient also decreased to 0.846, while the CV bias became 7.17 µg m−3. The decrease in gradient and increase in bias against in situ data suggest that the inclusion of ERA-Interim data does not improve the LUR model accuracy, though the increase in CV-adjusted R2 and decrease in RMSE show that it does improve the precision of the model.
For this work it is thought that the effect of meteorological data in the LUR model is limited by the spatial resolution of the satellite instruments, or of the ERA-Interim dataset. Previous LUR models incorporating daily meteorological data (e.g. Su et al., 2008; Lee and Koutrakis, 2014) have typically used measurements from weather stations either close to or at the sites where the NO2 concentrations have been measured, with the ambient temperature and wind field therefore interpolated from these fixed points. Because of the comparatively small number of NO2 stations available for this work, it was thought that a harmonized dataset like ERA-Interim would reduce the spatial uncertainty otherwise introduced by discrete weather stations. Future iterations of this work should investigate whether using in situ weather data would provide a better outcome.

Table 5. The adjusted R2 and RMSE (µg m−3) determined from the 5-fold cross-validation (CV) applied to models 1 and 2 (see Tables 2 and 4), grouped by year.

Validation using OMI and MACC-II reanalysis data
An alternative technique for deriving surface NO2 concentrations from satellite measurements is to use a chemical transport model (CTM) to estimate the vertical profile at the time of the satellite overpass (Lamsal et al., 2008). The profile can then be used to partition the tropospheric VCD into its surface and free-tropospheric components, thereby estimating a scaling factor that can be applied to the measured VCDs. This approach is advantageous in that it allows surface NO2 concentrations to be mapped at a higher spatial resolution than many CTM grids. For this work a similar approach to Lamsal et al. (2008) was used to infer surface NO2 concentrations from OMI data. Daily mean NO2 vertical profiles over Hong Kong were sampled from the MACC-II reanalysis dataset (Monitoring Atmospheric Composition and Climate). Following the form of Lamsal et al. (2008), the surface concentration $S$ is inferred from the measured tropospheric VCD $\Omega$ as

$$S = \frac{\nu S_G}{\nu \Omega_G - (\nu - 1)\,\Omega_G^F}\,\Omega \qquad (3)$$

Here, the terms $\Omega_G$ and $S_G$ are the tropospheric VCD and the surface concentration derived from the MACC-II daily average profile, for which the surface is defined as the lowest layer of the profile (20 m). To obtain the tropospheric VCD the profile is integrated up to the tropopause height taken from the OMNO2 dataset. The modelled free-tropospheric NO2 column, $\Omega_G^F$, is taken to be horizontally invariant over the MACC-II grid cell, in order to represent the longer NOx lifetime in the free troposphere. As the MACC-II grid cells (1.125° × 1.125°) are much larger than the OMI nadir pixels, the $S/\Omega$ conversion factor is weighted by an additional term, $\nu$, which is defined as the ratio of the local OMI tropospheric VCD to the mean OMI field over the MACC-II grid cell.
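A minimal numerical sketch of this scaling is given below. It assumes the functional form published by Lamsal et al. (2008), S = ν S_G / (ν Ω_G − (ν − 1) Ω_G^F) × Ω, which may differ in detail from the implementation used in this work. The function name `surface_no2`, its argument names, and the input values are hypothetical.

```python
def surface_no2(omega_local, omega_cell_mean, s_g, omega_g, omega_g_free):
    """Infer a surface concentration from a satellite tropospheric VCD.

    omega_local      local satellite tropospheric VCD
    omega_cell_mean  mean satellite VCD over the model grid cell
    s_g              modelled surface concentration
    omega_g          modelled tropospheric VCD
    omega_g_free     modelled free-tropospheric column
    (form assumed from Lamsal et al., 2008)
    """
    # nu: ratio of the local VCD to the mean field over the model grid cell
    nu = omega_local / omega_cell_mean
    scale = nu * s_g / (nu * omega_g - (nu - 1.0) * omega_g_free)
    return scale * omega_local

# Illustrative numbers in arbitrary but consistent units
s_uniform = surface_no2(2.0, 2.0, 30.0, 1.0, 0.2)   # nu = 1
s_hotspot = surface_no2(2.0, 1.0, 30.0, 1.0, 0.2)   # nu = 2
```

When ν = 1 the expression reduces to the simple model ratio S_G/Ω_G applied to the measured column. For ν > 1 the free-tropospheric column is held fixed, so the local enhancement is attributed entirely to the near-surface part of the column.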
MACC-inferred surface concentrations were calculated for all cloud-free OMI pixels measured over Hong Kong between 2005 and 2012 and compared against the daily ambient NO2 concentrations recorded at the in situ stations. Figure 7 shows the mean surface NO2 concentration estimated using MACC-II and OMI data for winter and summer over Hong Kong. Compared to Fig. 4, it is clear that the MACC-inferred concentrations are much lower and capture much less spatial information than the LUR models, because of limitations caused by the OMI spatial resolution. Over both seasons, NO2 concentrations appear to peak north of the Hong Kong SAR, potentially caused by emissions from Shenzhen and Bao'an, or transported further north from the Pearl River Delta.
Because of this lack of spatial detail, the MACC-II concentrations correlate very poorly with the in situ data (R2 = 0.11, RMSE = 41.9 µg m−3), with a linear gradient of ∼ 0.58. This analysis was repeated with MACC-II profiles modelled at 14:00 local time (the closest available time to the daily OMI overpass), with similarly poor agreement. In addition, previous comparisons of tropospheric NO2 VCDs inferred from MACC-II profiles with SCIAMACHY data over East Asia suggest that the dataset underestimates tropospheric NO2 by a factor of 2 in winter (Inness et al., 2013), which may also partially explain the lack of agreement with the in situ data. It is clear from this result that the mixed-effects LUR model offers better spatial resolution and predictive capability than the MACC-II reanalysis over Hong Kong.

Time series analysis
The Model 1 dataset covers a decade of near-continuous measurements, from which it may be possible to determine whether NO2 concentrations have significantly changed after accounting for noise and seasonal variation. To determine whether a statistically significant trend can be observed from this dataset, surface concentrations modelled over Kowloon and Hong Kong Island (see Fig. 1) were binned to monthly averages between 2005 and 2015. Following Hilboll et al. (2013), a linear trend with a seasonal component was fitted to this time series. The surface concentration at month t (Y(t), where t = 0 is January 2005) was modelled as a combination of a fixed intercept µ, a linear trend ω, a seasonal component, and a noise term N(t) (Eq. 4). The time series may be subject to variations in the seasonal component caused by changes in emissions and NOx lifetime. To reflect this, an additional term, ξ, is introduced to Eq. (4) to dampen or drive the seasonal oscillation over time. The term N(t) represents the noise component (i.e. the remaining signal in the time series that cannot be explained by the model). Equation (4) is first solved using nonlinear regression to determine the values of µ, ω, and ξ that minimize N(t). The seasonal components have a negligible impact on the estimation of the other parameters in Eq. (4) (Weatherhead et al., 1998), so these are subtracted from the time series. In addition, the autocorrelations are accounted for using a linear matrix transformation. Finally, linear regression is applied to determine µ and ω (Mieruch et al., 2008).
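The idea of fitting a linear trend together with a seasonal component can be sketched with a simplified stand-in for Eq. (4). The sketch assumes a single annual harmonic with constant amplitude, so it omits the ξ term and the autocorrelation transformation, and it is solved by ordinary linear least squares rather than nonlinear regression. The function names and the synthetic series are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_trend_seasonal(y):
    """Least-squares fit of
    y(t) = mu + omega*t + a*sin(2*pi*t/12) + b*cos(2*pi*t/12),
    with t in months. Returns ([mu, omega, a, b], residuals)."""
    w = 2 * math.pi / 12
    rows = [[1.0, float(t), math.sin(w * t), math.cos(w * t)]
            for t in range(len(y))]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bi * ri for bi, ri in zip(beta, r))
             for r, yi in zip(rows, y)]
    return beta, resid

# Synthetic, noise-free monthly series covering 11 years (illustrative only)
y = [60.0 - 0.02 * t + 10.0 * math.cos(2 * math.pi * t / 12) for t in range(132)]
(mu, omega, a, b), resid = fit_trend_seasonal(y)
```

The residuals returned here play the role of N(t), and would be the input to any subsequent autocorrelation analysis.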
In order to determine the linear trend error, it is assumed that the noise N(t) is autoregressive with lag 1 (AR(1)). Following the approach defined by Mieruch et al. (2008), the linear trend is considered to be statistically significant only if a condition expressed in terms of the Gauss error function, erf(x), is satisfied.
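One common way of implementing such a test is sketched below. It assumes the trend uncertainty approximation of Weatherhead et al. (1998) for AR(1) noise, σ_ω ≈ σ_N n^(−3/2) sqrt((1 + φ)/(1 − φ)) with n in years, and a criterion of the form erf(|ω| / (σ_ω √2)) > 0.95. Both are assumptions about the exact condition used in this work, and the function names and synthetic residuals are hypothetical.

```python
import math
import random

def lag1_autocorrelation(resid):
    """Sample lag-1 autocorrelation (phi) of a residual series."""
    n = len(resid)
    m = sum(resid) / n
    num = sum((resid[i] - m) * (resid[i - 1] - m) for i in range(1, n))
    den = sum((r - m) ** 2 for r in resid)
    return num / den

def trend_significance(omega_per_year, resid_monthly, level=0.95):
    """Return (confidence, is_significant) for a linear trend, given
    monthly residuals that are assumed to follow an AR(1) process."""
    n = len(resid_monthly)
    n_years = n / 12.0
    m = sum(resid_monthly) / n
    sigma_n = math.sqrt(sum((r - m) ** 2 for r in resid_monthly) / (n - 1))
    phi = lag1_autocorrelation(resid_monthly)
    # Weatherhead et al. (1998) approximation of the trend standard error
    sigma_omega = sigma_n / n_years ** 1.5 * math.sqrt((1 + phi) / (1 - phi))
    confidence = math.erf(abs(omega_per_year) / (sigma_omega * math.sqrt(2)))
    return confidence, confidence > level

# Synthetic residuals: 11 years of monthly noise (illustrative only)
rng = random.Random(0)
noise = [rng.gauss(0.0, 5.0) for _ in range(132)]
weak = trend_significance(-0.02, noise)
strong = trend_significance(-5.0, noise)
```

With noise of this magnitude, a trend of a few hundredths of a unit per year is far below the detection threshold, whereas a trend of several units per year is clearly significant.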
The monthly average time series and the fitted model are shown in Fig. 8, along with an annual bottom-up NOx emission inventory estimated by the HKEPD (HKEPD, 2014). The linear trend was estimated to be −0.0208 µg m−3 yr−1 (−0.430 % yr−1 relative to the average 2005 concentration). The seasonal dampening term ξ was estimated to be −0.0287 µg m−3 yr−1. However, the trend was found to be statistically insignificant. This analysis was repeated on the raw OMI tropospheric VCDs observed over the region, which resulted in a statistically insignificant trend of −2.52 % yr−1. A similar result was found by Hilboll et al. (2013) when analysing satellite data over Hong Kong between 1996 and 2012; they also found that the signs of µ and ξ were the same. Another investigation by Schneider et al. (2015) using only SCIAMACHY data also found a statistically insignificant negative trend, as well as a statistically significant trend of −3.8 % yr−1 over Shenzhen.
A statistically insignificant negative trend was also estimated when this analysis was repeated using data predicted by Model 2 (−0.537 % yr−1), as well as the spatial mean concentration reported by the in situ stations in this region (−0.240 % yr−1). By contrast, the HKEPD inventory shows a statistically significant trend of −1.60 % yr−1. A possible reason behind this discrepancy could be the influence of NOx emissions transported from mainland China, which may obscure any decline in local emissions. The coarse OMI spatial resolution can also cause a smoothing of sub-pixel plumes over urban areas, and so the resulting retrieved column may be an underestimate of the true value (Kim et al., 2016), which would therefore result in a negative bias in the modelled surface concentrations.

Conclusions
The Hong Kong SAR is subject to high ambient NO2 concentrations caused by a combination of local emissions and pollution transported from elsewhere in the Pearl River Delta. Exposure studies require the calculation of accurate surface concentration maps, which could be enhanced by the synoptic coverage offered by satellite instruments. For this work several mixed-effects LUR models were developed to explore this concept, which combined in situ NO2 measurements with tropospheric VCDs measured by satellite instruments. Despite a limited number of in situ stations, the mixed-effects models incorporating satellite data were found to have superior predictive performance in estimating daily ambient NO2 concentrations over the region compared to the reference model, with an average CV-adjusted R2 of 0.681.
The LUR models used high spatial resolution datasets such as road networks and MODIS land cover to simulate likely emission sources. This allowed for distinct features to be visible over districts such as Kowloon, Yantian, and Wan Chai (∼ 100 µg m−3). By contrast, local minima were observed over uninhabited areas such as the Sai Kung and Plover Cove Country Parks (∼ 5 µg m−3). One anomaly to this trend was the Lantau South Country Park, which was modelled to have ambient NO2 concentrations as high as 40 µg m−3. This enhancement may be the result of pollution from the nearby Hong Kong International Airport, or an artefact caused by the location of the Tung Chung station. The spatial features and relative intensities of these polluted regions appear very similar to the NO2 concentrations derived by Lee et al. (2017), who used a LUR model based on a far greater number of in situ measurements, but did not incorporate satellite data or random effects. This similarity demonstrates that a viable LUR model of a densely populated, heterogeneous landscape can be derived from a small set of in situ stations using satellite data. Very large features were also observed over Shenzhen and Bao'an, though validating these is beyond the scope of this work due to insufficient station coverage.
For this work several models were developed to assess the relative utility of OMI, SCIAMACHY, and GOME-2A data as predictor variables. The quality of these datasets differs significantly because of their temporal sampling and spatial resolution. From 5-fold cross-validation it was found that OMI data gave the best agreement with the in situ data, so long as seasonal effects were accounted for (CV-adjusted R2 = 0.838). OMI has the smallest ground pixel size and the longest temporal range of the three instruments, which allowed for local emissions and the seasonal cycle to be better accounted for. Larger ground pixel sizes are at risk of contamination by pollution transported from Shenzhen or elsewhere in the PRD, which may add a positive bias to all inferred surface concentrations over Hong Kong.
It was thought that the models including more than one satellite dataset would have improved sensitivity to diurnal variation, and so predict daily average surface concentrations better than models using a single dataset. However, as with all statistical models, the LUR model performance is dependent on the number of observations available, and can only predict day-specific surface NO2 concentrations when both satellite and in situ data are available on that day. As only cloud-free satellite data can be used, the number of available observations is therefore heavily dependent on the season and the spatial resolution of the satellite instrument (Krijger et al., 2007). Once diurnal changes in cloud cover are factored in, this means that models using more than one satellite instrument would be fitted using fewer observations than single-instrument models. Because of these issues and differences in spatial resolution, it was difficult to determine whether diurnal cycle coverage was accounted for by these models.
By collating cross-validation model data by in situ station and time it was possible to gauge the spatiotemporal representivity of each model. For models using only OMI data no significant negative trend in the CV-adjusted R2 was found between 2005 and 2015, suggesting that these models can account for the progressive loss of coverage caused by the row anomaly, allowing for high temporal representivity over the entire observation period.
The single-instrument models generally performed better than the multiple-instrument and reference models over all regions except for the rural Tap Mun station, where all models apart from the seasonal OMI model performed poorly. Tap Mun is the only rural station in the HK-AQN, which may have resulted in the models being biased in favour of highly polluting urban areas. One example of this bias is the longitudinal gradient present in most of the models, which is especially notable in Fig. 4. The longitudinal gradient has resulted in unrealistically high concentrations being reported over the uninhabited Lantau South Country Park, which raises concerns over the true spatial representivity of the models over regions where no in situ data are available. Future iterations of this work may require a more diverse in situ network and/or higher resolution satellite data to better capture the spatial gradient between polluted and unpolluted regions.
For this work temperature and wind information from the ERA-Interim reanalysis dataset was provided in the model training process, in order to simulate photochemical loss and mixing. However, it was found that including these variables did not significantly improve the model-adjusted R2 compared with other parameters used in this work, and so they were not selected by the model training process. When temperature and wind speed were forced into Model 1, the average NO2 concentration over the region increased by ∼ 17 %, though no new features were observed. Cross-validation with the in situ data suggests that while including ERA-Interim data improves model precision, the model accuracy falls. One possible cause of this decrease in accuracy may be that the spatial resolution of ERA-Interim was too coarse to fully represent the true atmospheric state. The model performance may potentially be improved if in situ measurements from a dense network of weather stations could be used instead.
Time series analysis was applied to surface concentrations predicted by the OMI-only models to determine whether a trend in emissions over Kowloon and Hong Kong Island could be determined between 2005 and 2015. Both models and the OMI data reported a statistically insignificant trend over this region (−0.430 % yr−1 for Model 1). By contrast, the HKEPD annual bottom-up NOx inventory suggests that a statistically significant trend of −1.60 % yr−1 should be observed during this period. Emissions transported from elsewhere in the PRD may have offset any observable decline in local emissions, though this would require accurate information on pollution outside of Hong Kong to verify. That said, the influence of mainland Chinese emissions on Hong Kong air quality has previously been investigated and quantified by Wang et al. (2017) and Xue et al. (2014) using more refined models, which supports the conclusion reached in this work.
In the absence of additional in situ data, surface NO2 concentrations were also estimated from OMI data using profiles from the MACC-II reanalysis dataset. However, surface concentration maps derived using this method had the same spatial resolution as OMI, and so were dominated by pollution transported from Shenzhen or further afield. In addition, the MACC-II dataset has previously been shown to have poor agreement with other satellite datasets over East Asia (Inness et al., 2013), which may also affect the accuracy of this method. Because of these issues, agreement with in situ data was very poor (R2 = 0.111) compared with the models used in this work. It is likely that better estimates could have been achieved with higher spatial resolution CTMs, such as the Models-3 Community Multiscale Air Quality model (CMAQ; Kuhlmann et al., 2015).
For the first time, this work has demonstrated the potential of combining in situ data and satellite data in a mixed-effects model to obtain better estimates of daily surface NO2 concentrations over a small, densely populated region. This approach can be readily applied to other megacities so long as a diverse in situ monitoring network exists to calibrate and validate the model. Despite the limited number of in situ stations available for this work, the mixed-effects model produces reliable high-resolution mapping of surface NO2 that remains robust over long timescales. This work also attempted for the first time to account for diurnal variation using only observations and a statistical approach, but was severely limited by differences in the spatiotemporal resolution of the satellite datasets.
However, the spatial resolution of the satellite instrument remains a source of error, which may lead to underestimating the true surface concentration over megacities. In the future, the performance of this model would be greatly improved by the inclusion of higher resolution satellite data from forthcoming missions such as Sentinel-5P (7 × 7 km; Veefkind et al., 2012). Accounting for diurnal cycle variability in daily estimates may also still be possible by combining daily measurements made by instruments with similar spatial resolutions (e.g. Geostationary Environmental Monitoring Spectrometer, GEMS; Kim, 2012). Further improvements could also be made by the inclusion of spatiotemporal emission data, such as traffic volumes or emission inventories. However, such datasets would need to have a high spatial resolution comparable to the fixed parameters used in this work in order to have a significant influence on the model.

Data availability. Monthly averages of the Model 1 data are provided as netCDF files at http://emep.int/panda/wp2/HongKongSAR.zip.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 606719, as part of the PArtnership with ChiNa on space DAta (PANDA) project. Additional funding was also provided by the UK Natural Environment Research Council (NERC) under grant no. NE/N006941/1, as part of An Integrated Study of AIR Pollution PROcesses in Beijing (AIRPRO).
We acknowledge the use of OMI data made available from the NASA MIRADOR service (http://disc.sci.gsfc.

Figure 1. The in situ NO2 stations from the HK-AQN used in this work. The red line indicates the international boundary of the Hong Kong SAR, which this work focuses on. The green rectangle represents the Kowloon district and Hong Kong Island, from which the time series in Sect. 3.8 was derived.

Figure 2. Comparison of the mean surface NO2 concentration estimated by Model 1 (left) and the mean tropospheric VCD measured by OMI (right) during the period 2005-2015. Grey regions indicate regions beyond the scope of the model: oceans and areas where no cloud-free satellite measurements were available during this period. Important areas are indicated.

Figure 2 shows the mean surface NO2 concentration during 2005-2015 as predicted by Model 1, compared to the mean OMI tropospheric NO2 VCD observed during the same period. The Model 1 output shows clear enhancements over known residential areas, with the densely populated districts of Kowloon, Wai Chung, and Kwai Chung showing concentrations > 100 µg m−3. Additional enhancements are also visible over Hong Kong International Airport, and industrial parks such as Yantian. Conversely, unpopulated regions such as the Plover Cove and Sai Kung Country Parks show very low concentrations (∼ 5 µg m−3). The spatial distribution and relative intensity of the polluted regions are visually similar to the concentrations forecasted by the LUR model developed by Lee et al. (2017), which did not incorporate satellite data or random effects, but made use of far more in situ sites (95) than the 11 used in this work. Outside of the Hong Kong SAR, significant enhancements are also found over Shenzhen and Bao'an, which likely reflect the high population density and manufacturing industries located there.

Figure 3. The mean surface NO2 concentration predicted by each of the models listed in Table 2 for 2007-2012. Each plot also shows the number of cloud-free days from which the models could be trained.

Figure 4. The mean surface NO2 concentration predicted by models 1 and 2 during winter (November-April) and summer (May-October) between 2005 and 2015.

Figure 5. The CV-adjusted R2 and RMSE for each of the HK-AQN stations used in this work, as reported by the models listed in Table 2.

Figure 6. The mean surface NO2 concentration predicted by Model 1 and 2 during winter (November-April) and summer (May-October) between 2005 and 2015, with and without the inclusion of wind speed and temperature from the ERA-Interim reanalysis dataset (Dee et al., 2011).

Figure 7. The mean surface NO2 concentration inferred from OMI tropospheric VCDs using MACC-II reanalysis data, between 2005 and 2012. Data are plotted for winter (left, November-April) and summer (right, May-October).

Figure 8. Top panel: time series analysis of the monthly mean surface NO2 between 2005 and 2015 predicted by Model 1 (see Table 2) over the region covering Kowloon and Hong Kong Island shown in Fig. 1. The error bars represent the standard error of the mean for each month, while the red line represents the linear trend and seasonal cycle modelled using Eq. (4). The linear trend is also shown separately as the blue dashed line. Bottom panel: the annual total NOx emissions by Hong Kong, as estimated by the HKEPD bottom-up inventory (HKEPD, 2014).

Table 1. The predictor variables (X_m) considered for the LUR model (Eq. 1) used in this work.

Table 2. The LUR models considered in this work, showing the time period and satellite instruments used. Note that Model 9 is a multiple linear regression model which does not include satellite data or random effects.

Table 3. Description of the LUR models shown in Table 2, showing the predictor variables (including buffer radii where applicable) and adjusted R2. Models combining OMI and SCIAMACHY data failed to converge regardless of predictor variable, so no viable dataset was produced for Model 7.