Observing local CO2 sources using low-cost, near-surface urban monitors

Urban carbon dioxide comprises the largest fraction of anthropogenic greenhouse gas emissions, but quantifying urban emissions at subnational scales is highly challenging, as numerous emission sources reside in close proximity within each topographically intricate urban dome. Efforts to understand each individual source’s contribution to the overall emission budget are hindered by a large gap between activity-based emission inventories and observational constraints on integrated, regional emission estimates. Here we leverage urban CO2 observations from the BErkeley Atmospheric CO2 Observation Network (BEACO2N) to enhance, rather than average across or cancel out, our sensitivity to these hyperlocal emission sources. We utilize a method for isolating the local component of a CO2 signal that accentuates the observed intraurban heterogeneity and thereby increases sensitivity to mobile emissions from specific highway segments. We demonstrate a multiple linear regression analysis technique that accounts for boundary layer and wind effects and allows for the detection of changes in traffic emissions on scale with anticipated changes in vehicle fuel economy, an unprecedented level of sensitivity for low-cost sensor technologies. The ability to represent trends of policy-relevant magnitudes with a low-cost sensor network has important implications for future applications of this approach, whether as a supplement to sparser existing reference networks or as a substitute in areas where fewer resources are available.


Introduction
Initiatives to curb greenhouse gas emissions and thereby reduce the extent of climate-change-related damages are gaining momentum from city to global scales (United Nations, 2015). To support this effort, there is a clear need for monitoring strategies capable of describing emission changes and attributing those changes to the relevant policy measures (Pacala et al., 2010). Currently, an estimated 70 %-80 % of global CO2 emissions are urban in origin, and this fraction is expected to grow as migration to urban areas continues and intensifies with the industrialization of developing nations (United Nations, 2011). However, cities also present the largest atmospheric monitoring challenge in that many disparate emission sources combine with complex topography.
A considerable amount of emission estimation work has been invested in the development of activity-based emission inventories for selected metropolitan areas, such as Indianapolis (Gurney et al., 2012), Paris (Bréon et al., 2015), Los Angeles (Newman et al., 2016), Salt Lake City (Patarasuk et al., 2016), and Toronto (Pugliese et al., 2018), as well as other inventories constructed and maintained by individual air management agencies for internal use. These inventories, when updated regularly, offer the possibility of direct source attribution without the use of computationally intense and/or heavily parameterized atmospheric transport models; they do, however, typically rely on interpolations, generalizations, or proxies to generate the necessary input activity data. The Fuel-based Inventory for Vehicle Emissions (FIVE) developed by McDonald et al. (2014), for example, uses a representative 7 days of highway traffic flow measurements to drive the weekly cycle of CO2 emissions from mobile sources on roads of all sizes year-round. While traffic patterns and residential and commercial energy usage are known to vary by day of week (Harley et al., 2005), the specific timing and magnitude of these variations are likely to be heterogeneous in space and time. Mobile emission estimates constructed using an average week of highway observations therefore neglect the impact of anomalous events as well as the variety of vehicle fleets, commute practices, and congestion patterns that occur at the neighborhood level. As knowledge of emission factors and fuel efficiency grows, activity data will become one of the largest sources of uncertainty in bottom-up inventory products.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Ambient atmospheric measurements offer the opportunity to observe nuanced variations in CO2 emission activities directly without generalizing across space and time. In order to document baseline conditions in and upcoming changes to urban greenhouse gas emissions, surface-level monitoring campaigns in cities using varied approaches are being pursued (e.g., Bréon et al., 2015; Chen et al., 2016; McKain et al., 2012, 2015; Shusterman et al., 2016; Turnbull et al., 2015; Verhulst et al., 2017). These networks, typically consisting of 2-15 instruments, attempt to constrain and supplement activity-based emission inventories with observation-based estimates. Most previous work on observation-based emission estimates has focused on domain-wide emission totals over monthly to annual timescales (e.g., Kort et al., 2013). This emphasis on integrated signals has led to site selection and data analysis techniques that minimize sensitivity to local emissions, thus discarding a large portion of the information contained in the datasets collected at individual measurement sites and the differences between them (Shusterman et al., 2016; Turner et al., 2016).
We hypothesize that, if trends in the specific small-scale CO2 sources implicated in most mitigation strategies are to be resolved from atmospheric monitoring datasets, site-to-site heterogeneity must be sought out and retained. Here we present an initial characterization of the degree of spatial heterogeneity present in an urban monitoring dataset and offer these direct observations of intracity heterogeneities as a possible strategy for providing direct constraints on CO2 emissions from individual sectors. We provide an initial approach to quantifying changes in the mobile sector and separating the influence of that sector from other emissions.

The BErkeley Atmospheric CO2 Observation Network
The BErkeley Atmospheric CO2 Observation Network (BEACO2N; see Shusterman et al., 2016) is a high-density network of low-cost, near-surface CO2 monitors in the San Francisco Bay Area (Fig. 1). A description of the design, deployment, and evaluation of the BEACO2N approach can be found in Shusterman et al. (2016) and Kim et al. (2018).
Here we utilize CO2 observations from the 20 BEACO2N sites operating most consistently during the summer and/or winter of 2017 (Table 1), defined as 1 June 2017 through 30 September 2017 and 1 November 2017 through 31 January 2018, respectively. The raw 2 s CO2 concentrations are averaged to 1 min means, which are subsequently converted to bias-corrected dry-air mole fractions using site-specific meteorological observations and in-network reference measurements (see Shusterman et al., 2016). The processed 1 min averages are assumed to have an instrumental uncertainty of less than ±4 ppm. The longer averaging timescales used hereafter reduce the error of the mean (e.g., ±1.8 ppm at 5 min resolution, ±0.5 ppm at hourly resolution, ±0.06 ppm for a given hour of the day over an entire season), although the concomitant increase in the influence of atmospheric variability cannot be quantified. Any long-term drift in the sensors is accounted for via a combination of periodic (i.e., every 12-24 months) laboratory recalibration and a post hoc data treatment based on the supersite situated within the network domain. This procedure allows us to confidently compare measurements taken multiple years apart, thus enabling interannual changes in CO2-related phenomena to be monitored. The exact details of the calibration and post hoc data treatment are provided in Shusterman et al. (2016).
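The quoted reduction in the error of the mean is consistent with simple 1/sqrt(N) scaling of the ±4 ppm 1 min uncertainty. The sketch below reproduces the 5 min and hourly figures under the assumption, not stated explicitly in the text, that the 1 min errors are independent and identically distributed; the function and constant names are illustrative only.

```python
import math

SIGMA_1MIN_PPM = 4.0  # 1 min instrumental uncertainty quoted in the text


def error_of_mean(sigma_1min, n_minutes):
    """Standard error of the mean of n independent 1 min averages (assumed i.i.d.)."""
    return sigma_1min / math.sqrt(n_minutes)


five_min = error_of_mean(SIGMA_1MIN_PPM, 5)
hourly = error_of_mean(SIGMA_1MIN_PPM, 60)
print(f"5 min: +/-{five_min:.1f} ppm, hourly: +/-{hourly:.1f} ppm")
```

Correlated sensor errors would shrink more slowly than this, so the values are best read as lower bounds on the instrumental contribution.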

Traffic counts
Traffic count data are collected by the California Department of Transportation as part of the Caltrans Performance Measurement System (PeMS; http://pems.dot.ca.gov/, last access: 23 September 2018). Hourly passenger vehicle flow data (in vehicles per hour) are obtained from the road monitors nearest to the relevant BEACO2N site with > 50 % directly observed (as opposed to modeled) data and are summed across all lanes and directions. Due to limited data coverage, in some cases it is necessary to sample road monitors upstream or downstream of the desired roadway segment; here we assume the sampled traffic conditions to be reasonable approximations of those on the desired segment.
The specific monitor IDs used in each analysis are given in Table 1.

Results & discussion
To quantify the spatial heterogeneity present across the network, we examine the degree of correlation between every possible pairing of sites in a given season as a function of the distance between them, borrowing from a similar analysis used by McKain et al. (2012). For straightforward comparison with the McKain et al. results, we first average the total CO2 mole fractions to 5 min resolution. Then, for every pairwise combination of two sites, we perform an ordinary least squares linear regression between the two 5 min time series and calculate the Pearson correlation coefficient. We repeat this procedure after offsetting the two time series by ±5 min, ±10 min, etc., allowing for up to a ±3 h lag, and choose the optimal r² value from the possible offsets. We plot the thus-optimized pairwise correlations as a function of the distance separating the two relevant sites (Figs. 2 and 3) and fit the results to a single-term exponential decay on top of a constant background, defined by the mean correlation observed at inter-site distances greater than 20 km.
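The lag search described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for Figs. 2 and 3: it assumes two gap-free, equal-length 5 min series, and the names `pearson_r`, `best_lagged_r2`, `site_a`, and `site_b` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def best_lagged_r2(a, b, max_lag_steps=36):
    """Return (best r^2, lag) over lags of -max_lag_steps..+max_lag_steps.

    One step is one 5 min sample, so 36 steps span the +/-3 h window.
    A positive lag means series b trails series a by that many steps.
    """
    best_r2, best_lag = -1.0, 0
    for lag in range(-max_lag_steps, max_lag_steps + 1):
        if lag >= 0:
            xa, xb = a[:len(a) - lag], b[lag:]
        else:
            xa, xb = a[-lag:], b[:len(b) + lag]
        if len(xa) < 3:
            continue  # too few overlapping samples to correlate
        r2 = pearson_r(xa, xb) ** 2
        if r2 > best_r2:
            best_r2, best_lag = r2, lag
    return best_r2, best_lag


# Synthetic check: site_b repeats site_a four samples (20 min) later.
site_a = [math.sin(0.1 * i) + 0.3 * math.sin(0.37 * i) for i in range(400)]
site_b = [site_a[0]] * 4 + site_a[:-4]
r2, lag = best_lagged_r2(site_a, site_b)
print(lag, round(r2, 3))
```

Real records contain gaps, so a production version would need to align the two series by timestamp and drop incomplete pairs before correlating.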
In the summer months, there appears to be some relationship between the proximity of the sites and the correlation of their observations at all hours, with higher correlations between neighboring sites decaying into more modest, but still significant, correlations at longer inter-site distances. The characteristic length scale of this correlation is 2.9 km (defined as the e-folding distance of the exponential fits in Fig. 2; 3.6 km during the day and 2.2 km at night), which we interpret as an indicator of the distance at which various emission sources exert influence over a site's measurements. Shorter correlation lengths indicate sensitivity to near-field emissions, while longer correlation lengths imply sensitivity to far-field phenomena.
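One way to extract an e-folding distance from the pairwise correlations is sketched below. The fitting procedure used for Figs. 2 and 3 is not specified beyond its functional form, so this example substitutes a log-linear least squares fit after subtracting the far-field background; `fit_efolding` and the synthetic data are illustrative assumptions.

```python
import math


def fit_efolding(distances_km, r2_values, far_km=20.0):
    """Fit r2(d) = background + A * exp(-d / L) and return (background, A, L).

    The background is the mean r2 at separations beyond far_km, as in the
    text. The decay term is fitted by ordinary least squares on
    log(r2 - background), a simplification adopted for this sketch.
    """
    far = [r for d, r in zip(distances_km, r2_values) if d > far_km]
    background = sum(far) / len(far)
    pts = [(d, math.log(r - background))
           for d, r in zip(distances_km, r2_values)
           if d <= far_km and r > background]
    n = len(pts)
    md = sum(d for d, _ in pts) / n
    my = sum(y for _, y in pts) / n
    slope = (sum((d - md) * (y - my) for d, y in pts)
             / sum((d - md) ** 2 for d, _ in pts))
    intercept = my - slope * md
    return background, math.exp(intercept), -1.0 / slope


# Synthetic, noise-free example built with a 2.9 km e-folding distance.
dist = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 25.0, 30.0]
r2_obs = [0.2 + 0.6 * math.exp(-d / 2.9) if d <= 20.0 else 0.2 for d in dist]
bg, amp, length_km = fit_efolding(dist, r2_obs)
print(round(bg, 2), round(amp, 2), round(length_km, 2))
```

A log-linear fit discards pairs at or below the background and weights small excesses heavily, so a nonlinear least squares fit may be preferable for noisy data.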
The winter months exhibit lower pairwise correlations overall and shorter correlation lengths relative to the summertime (2.4 km; 2.6 km during the day and 2.1 km at night). Some portion of the summer-winter differences may be attributable to seasonal differences in dominant wind patterns, although this effect is difficult to disentangle from the slightly different collection of sites sampled during the two seasons; the winter sample, for example, contains fewer pairs with separation lengths less than 5 km, which affects the perceived overall trend. In either season, the correlation lengths are, as expected, considerably longer than the previously observed ∼ 100-1000 m e-folding distances of reactive urban pollutants that are also lost via chemical pathways (e.g., Zhu et al., 2006; Beckerman et al., 2008; Choi et al., 2014), thus validating the original choice of 2 km as the desirable inter-site separation in the design of the BEACO2N instrument.
The 24 h findings (top panels of Figs. 2 and 3) compare well to those presented by McKain et al., who also documented a decaying but nevertheless persistent correlation with increasing site separation. However, McKain et al. saw very little correlation after restricting their analysis to daytime hours, even at very short (< 5 km) inter-site distances, which implies that their daytime observations reflect hyperlocal phenomena only. In contrast, we observe moderate to high correlations during the day, which illustrates that information about emissions and transport phenomena on a variety of scales is preserved. A spatial visualization of the daytime correlation coefficients at four representative winter sites is shown in Fig. 4. We see that PER is well correlated with its neighbors only, suggesting the presence of local phenomena that do not affect other parts of the network. LCC, however, also exhibits relationships with more distant sites, indicating a sensitivity to more regional-scale (10-30 km) influences. Meanwhile, HRS and OHS each possess at least one near neighbor with which they are poorly correlated, perhaps due to hyperlocal events specific to those sites. While the region-wide phenomena can be characterized using sparser networks of high-cost, conventional monitoring equipment, the ability to capture these local processes is unique to the high-density approach.
We posit that the true strength of a high-density, surface-level monitoring network lies in its characterization of hyperlocal phenomena unique to a given site or subset of sites. In order to directly examine signals attributable to these specific local CO2 emission processes, we separate each site's observations into a "regional" and "local" component. The regional component is, by definition, the same at all sites network-wide, calculated from the bottom 10th percentile of all BEACO2N readings collected during the surrounding 1 h window. The bottom 10th percentile is chosen (rather than the absolute minimum) to account for measurement error (±4 ppm at 1 min resolution; see Shusterman et al., 2016) as well as any near-field drawdown from the local biosphere; negative values in the local signals are likely attributable to some combination of these effects. While many different sites contribute to this bottom 10th percentile over the course of the data record, some sites located in close proximity to emission sources are never represented in the bottom 10th percentile and always exhibit some enhancement (i.e., a nonzero local component) over the regional background signal. The regional component is allowed to vary throughout the data record and will therefore reflect domain-wide changes in response to day of week, synoptic weather events, etc.
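The regional-local separation can be illustrated with a short sketch. It assumes 1 min series on a shared time axis, a centered window of ±30 min, and a linearly interpolated percentile; none of these implementation details are given in the text, and the function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def split_regional_local(readings, half_window=30, q=10.0):
    """Split each site's series into a shared regional and a site-specific local part.

    readings maps site name -> list of 1 min values (None marks a gap); all
    lists share one time axis. The regional value at time t is the q-th
    percentile of every reading from every site within +/-half_window minutes.
    """
    n = len(next(iter(readings.values())))
    regional = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        pool = [v for series in readings.values()
                for v in series[lo:hi] if v is not None]
        regional.append(percentile(pool, q) if pool else None)
    local = {
        site: [None if v is None or r is None else v - r
               for v, r in zip(series, regional)]
        for site, series in readings.items()
    }
    return regional, local


# Toy network: one background-like site and two sites with steady enhancements.
toy = {"A": [400.0] * 10, "B": [410.0] * 10, "C": [450.0] * 10}
regional, local = split_regional_local(toy, half_window=2)
print(regional[0], local["B"][0], local["C"][0])
```

Because the regional value is a low percentile rather than the minimum, sites near the background can show slightly negative local values, as noted above.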
The diel profiles of the regional signal measured in summer and winter 2017 are shown in Fig. 5, reflecting the typical convolution of background concentrations, emission processes, and dynamics experienced across the entire BEACO2N domain. In both seasons, we see an increase in the regional signal beginning around 04:00 local time (LT), followed by a decrease in concentrations at 08:00 LT in the winter months and 11:00 LT in the summer, and another increase in early to late afternoon, depending on the season. This diurnal profile corresponds well with known patterns in traffic emissions, which are largely consistent across seasons, superimposed on diel fluctuations in boundary layer height and/or biosphere activity that vary in timing and magnitude according to the season. Specifically, these results suggest that the nighttime boundary layer in the BEACO2N domain is shallower during the winter months, producing a larger regional increase in response to rush hour traffic. The wintertime layer also appears to expand and re-contract earlier in the day than the summertime layer, resulting in both an earlier minimum and an earlier rise in afternoon-evening concentrations. The larger amplitude of the wintertime diurnal cycle may also reflect the greater influence of daytime photosynthesis and nighttime respiration during the San Francisco Bay Area's rainy winter season. An analysis of the regional signals calculated for similar periods in 2013 revealed qualitatively similar results (Fig. S1 in the Supplement), although it should be noted that the 2013 analysis uses observations from a significantly different subset of sites in the BEACO2N network.
We isolate the local signals by subtracting the network-wide regional component from the data record at each site. The distributions of the resulting local enhancements are shown in the Supplement (including Fig. S3), confirming a much greater frequency of high CO2 concentrations during the winter months. In both seasons, the distribution of the local enhancements is typically unimodal with a heavy right-hand tail, although some sites exhibit more complex bi- or multi-modal distributions.
By definition, we expect these local signals to represent a unique combination of emission sources and atmospheric dynamics specific to a given site. Here we endeavor to determine whether measurements of local CO2 enhancements can be used to monitor a single urban emission source, despite the complex landscape of CO2 sources and sinks present within the study domain. We choose to focus on mobile CO2 emissions as these are estimated to comprise approximately 40 % of the San Francisco Bay Area's annual CO2 emissions (Claire et al., 2015). This is the largest source sector in the CO2 emission inventory and likely to represent an even larger fraction within the urban core, where the next-largest source sectors (industrial/commercial and electricity/co-generation) are less abundant. However, as noted in the discussion of the regional signals above, direct observation of the magnitude and variation of traffic emissions via ambient CO2 concentrations is complicated by the coincident variation in turbulent mixing and boundary layer height as the earth's surface warms and cools at sunrise and sunset (Fig. S4).
In order to more directly examine the relationship between highway traffic flow and urban CO2 concentrations, we begin by analyzing the subset of observations collected between 04:00 and 08:00 LT at the LAN site, located less than 40 m from Interstate 880. During this period, traffic emissions are high, but the boundary layer is relatively shallow, thus increasing the sensitivity of the surface-level monitor to the traffic signal. The resultant strong positive correlation between rush hour traffic flow and local CO2 concentrations is shown in Fig. 6. An alternative analysis using traffic density (obtained by dividing the traffic flow by the average vehicle speed) yields almost identical results (Fig. S5), revealing a factor-of-2 increase in local CO2 mole fraction enhancements during congestion (high traffic flow/density) relative to free-flowing conditions (low traffic flow/density), similar to that observed by a previous on-road mobile monitoring study by Maness et al. (2015). Also shown in Fig. 6 are the median CO2 concentrations observed in each 500 vehicles h⁻¹ traffic flow increment and the ordinary least squares linear regression through these binned medians.
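The binned-median regression shown in Fig. 6 can be sketched as follows. The data are invented for illustration, bin centers are assumed to represent each 500 vehicles h⁻¹ increment, and the helper names are hypothetical.

```python
import statistics


def binned_medians(flow, co2, bin_width=500):
    """Median CO2 enhancement in each traffic flow increment of bin_width."""
    bins = {}
    for f, c in zip(flow, co2):
        bins.setdefault(int(f // bin_width), []).append(c)
    centers, medians = [], []
    for k in sorted(bins):
        centers.append((k + 0.5) * bin_width)  # bin center, an assumption
        medians.append(statistics.median(bins[k]))
    return centers, medians


def ols(x, y):
    """Slope and intercept of an ordinary least squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


# Invented hourly flows (vehicles per hour) and local CO2 enhancements (ppm).
flow = [250, 300, 400, 750, 800, 1250, 1300, 1400]
co2 = [5, 6, 100, 10, 12, 15, 16, 40]
centers, medians = binned_medians(flow, co2)
slope, intercept = ols(centers, medians)
print(centers, medians, slope, intercept)
```

Regressing through medians rather than raw points limits the influence of occasional large plumes, such as the 100 ppm value in the first bin of this toy dataset.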
In addition to this first-order sensitivity to vehicle emissions at the near-roadway LAN site, we find that relatively subtle emission changes can also be detected using nodes stationed greater distances from the highway by controlling for the confounding impacts of dispersion and the biosphere. To do so, we decompose the CO2 signals into terms that represent the influence of meteorology (which is correlated with both dispersion and biosphere activity) and emissions separately via a multiple-linear-regression (MLR) approach analogous to that described by de Foy (2018). Briefly, we use an ordinary least squares linear regression to calculate the best fit of the relationship between a site's CO2 signal and temperature, specific humidity, wind, boundary layer height, time of day, day of week, and time of year. Hourly measurements of temperature, specific humidity, wind speed, and wind direction are taken from a single NOAA Integrated Surface Database weather station at the Port of Oakland International Airport (https://www.ncdc.noaa.gov/isd/, last access: 23 September 2018), and 3 h boundary layer heights are provided at 0.125° by 0.125° resolution by the ECMWF's ERA-Interim model (Dee et al., 2011; http://apps.ecmwf.int/datasets/, last access: 23 September 2018). Although the low spatiotemporal resolution of these datasets limits their ability to capture hyperlocal meteorologies, here we follow the example of de Foy, who was nonetheless able to derive meaningful results from similarly coarse weather products.
The nonlinear relationship between CO2 concentrations and wind or boundary layer height is captured by dividing these meteorological datasets into quartiles and assigning each observation a value between 0 (at the maximum of the quartile) and 1 (at the minimum) using piecewise linear interpolation. The wind speed quartiles are further subdivided by wind direction and reassigned values of 0-1 accordingly before fitting a linear coefficient to each subset. The time of year is represented as a sum of sines and cosines with annual or semiannual periodicities whose values also vary between 0 and 1 and whose amplitudes are determined by the linear regression. Zeroes and ones are used to designate each hour of each type of day of the week as well. For example, time steps corresponding to 08:00 LT on a Monday may be assigned a 1 while all other time steps are set to zero before the linear regression is performed. As a result, the MLR factors derived for each of the preceding explanatory variables can be interpreted in units of ppm CO2. Meanwhile, the temperature and specific humidity variables are treated by calculating their difference from their mean values and dividing by their respective standard deviations before each is fit to CO2 with a single linear coefficient, which will have units of ppm K⁻¹ and ppm (kg water kg⁻¹ air)⁻¹, respectively. The independent variable leading to the greatest square of the Pearson correlation coefficient is then combined with each of the remaining variables, and a second regression is performed. The two-input combination leading to the largest increase in the correlation coefficient is then combined with each of the remaining variables, and so on, until the addition of a new independent variable no longer increases the r² value by at least 0.005.
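The variable selection loop described above resembles forward stepwise regression with an r² stopping threshold. The sketch below illustrates that loop under simplifying assumptions: each candidate is a single numeric column (whereas the hour-of-day and wind terms in the text are groups of columns), the predictors are not collinear, and all names and data are invented.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def r_squared(columns, y):
    """r^2 of an OLS fit of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    my = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(predictors, y, min_gain=0.005):
    """Add one predictor at a time until r^2 improves by less than min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = dict(predictors)
    while remaining:
        scores = {name: r_squared([predictors[c] for c in chosen] + [col], y)
                  for name, col in remaining.items()}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        best_r2 = scores[name]
        del remaining[name]
    return chosen, best_r2


# Invented data: y depends on x1 and x2 only; x3 is an irrelevant candidate.
n = 35
preds = {
    "x1": [float(i % 7) for i in range(n)],
    "x2": [float((3 * i) % 5) for i in range(n)],
    "x3": [float((i * i) % 3) for i in range(n)],
}
y = [10.0 + 2.0 * a + 1.0 * b for a, b in zip(preds["x1"], preds["x2"])]
chosen, best_r2 = forward_select(preds, y)
print(chosen, round(best_r2, 3))
```

Solving the normal equations directly is adequate for a handful of well-scaled predictors; a larger design matrix, such as one with an indicator column for every hour of every day type, would be better served by a numerically stable least squares routine.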
For this analysis, we use hourly total CO2 concentrations (the sum of the local and regional components) measured at five sites between 15 February 2017 and 15 February 2018. For each site, the optimal set of explanatory variables and their relative contributions to the correlation coefficient are given in Table 2. Summing the products of each of the MLR factors with their respective independent variables (e.g., time of day, wind speed) gives the mixing ratio predicted by the MLR model; a representative week of observed and modeled CO2 concentrations is shown in Fig. 7. We find generally good agreement, with some significant hour-by-hour model-observation differences, especially at RFS. These do not, however, appear to be systematic either in sign or in timing (e.g., the rush hour peak in CO2 may be poorly modeled on one day but well predicted on another).
Multiple-linear-regression coefficients are derived for each hour of the day during five types of days of the week (Mondays, Tuesdays through Thursdays, Fridays, Saturdays, and Sundays); for clarity, Fig. 8 shows the regression coefficients for Tuesdays through Thursdays and Sundays. Other days of the week are shown in Fig. S6. These MLR "factors" signify the average CO2 enhancement or depletion (in ppm) uniquely associated with a particular hour of a particular day of the week. The dependencies on time of day and day of week derived via this method are hypothesized to primarily reflect the changes in emissions, as the influence of the coincident changes in atmospheric dynamics has been at least partially controlled for. For reference, the corresponding Tuesday-Thursday and Sunday diel cycles in the total CO2 observed at each site are shown in Fig. 9. Indeed, we do observe some of the same intuitive patterns in the linear regression coefficients, such as higher coefficients on weekday mornings corresponding to higher rush hour traffic emissions on those days, but with greater opportunity to differentiate between days of the week, especially around noon, when raw concentrations are generally similar. As expected, the Tuesday-Thursday enhancement in the MLR factors is larger at the sites located close to a freeway (e.g., up to 520 % higher than the corresponding Sunday MLR factor at FTK) but is less pronounced at LBL (70 %), which is farther away from major mobile sources. For reference, the 1 km by 1 km FIVE mobile emission inventory developed for the San Francisco Bay Area by McDonald et al. (2014) predicts a ∼ 210 % weekday enhancement on average, peaking around 05:00 LT, much earlier in the day than is observed here.
Figure 7. Representative week of total CO2 concentrations observed (thick gray curve) and modeled (dashed blue curve) at five sites using a multiple-linear-regression approach based on de Foy (2018).
When we examine the relationship between these multiple-linear-regression coefficients and morning traffic flow as we did at LAN (Fig. 10), we again find positive correlations. This is an interesting result, given that the traffic flow measured on a single highway likely provides only a first-order approximation of the total traffic emissions influencing a single CO2 monitor, especially those situated at greater distances from said highway, which may be sensitive to additional highways, as well as local roads. Although the predominance of a single highway's emissions (or at least its correlation with those from other sources) is not a necessary condition of our MLR analysis, the strong positive correlations we observe suggest that this methodology may nonetheless be useful in monitoring emissions from individual highways such as these.
The standard error of the slope of the linear regression is calculated as the standard deviation of the model-data CO2 residuals divided by the square root of the sum of the squared differences between each traffic flow increment and the mean traffic flow. The 1σ uncertainty in the slopes (i.e., the 68 % confidence interval, assuming a Gaussian error distribution) is thus found to be 11 %-30 %, indicating that analysis of a single site could be used to detect as small as 11 % changes in average emissions per vehicle, an improvement upon the 17 % slope uncertainty calculated for the near-highway LAN site. For reference, under the Corporate Average Fuel Economy standards, the state of California aims to achieve a fleetwide average fuel economy of 23.2 km L⁻¹ by the year 2025 (US EPA, 2012), corresponding to a 35 % decrease in emissions relative to the 15.1 km L⁻¹ economy of 2012-2016 model year vehicles. Assuming a steady decrease in emissions of 3.5 % yr⁻¹, an 11 % decrease would be achieved after approximately 3 years, showing that one BEACO2N site is therefore sufficiently sensitive to detect such a trend with 68 % confidence in as little as 3 years. By leveraging multiple independent sites, even greater confidence and/or shorter timescales could be achieved.
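The slope uncertainty and detection timescale quoted above can be checked with a few lines of arithmetic. The sketch assumes that per-vehicle CO2 emissions scale inversely with fuel economy and uses the sample standard deviation of the residuals (the conventional OLS estimator divides by n - 2 instead, which differs little for large samples); the traffic flows and residuals are invented.

```python
import math
import statistics


def slope_standard_error(x, residuals):
    """Residual standard deviation over sqrt(sum of squared deviations of x)."""
    mx = sum(x) / len(x)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return statistics.stdev(residuals) / math.sqrt(sxx)


# Invented traffic flow increments (vehicles per hour) and residuals (ppm).
flows = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
resid = [1.0, -1.0, 0.0, 1.0, -1.0]
se = slope_standard_error(flows, resid)

# Emission change implied by the fuel economy target, assuming emissions
# per kilometer are inversely proportional to fuel economy (km per liter).
total_reduction = 1.0 - 15.1 / 23.2

# Years needed for a steady 3.5 % per year decline to reach the 11 % threshold.
years_to_detect = 0.11 / 0.035

print(f"{se:.2e}", f"{100 * total_reduction:.1f} %", f"{years_to_detect:.1f} yr")
```

The 34.9 % result matches the rounded 35 % decrease cited in the text, and the roughly 3.1 years agrees with the stated detection timescale of approximately 3 years.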
It is likely that sensitivity could be further enhanced with more accurate meteorological datasets. While the single weather station and relatively coarse (0.125° by 0.125°) reanalysis product we use here may be adequate to represent the meteorological conditions across some domains, the San Francisco Bay Area is at the high end of complexity in terms of terrain and microclimatology. Higher-resolution boundary layer heights and neighborhood-specific wind observations may improve the results of our multiple linear regression, but these types of measurements are rarely available on the spatial scale of the BEACO2N instrument and are difficult to simulate with accuracy (Jiménez et al., 2013; Banks et al., 2016). In future work, high-density networks like BEACO2N may therefore be useful not just in source attribution but also in providing a much-needed observational constraint on our understanding of near-surface transport.
Future work will also make use of the ancillary datasets provided by the BEACO2N platform, such as the concurrent NOx and CO concentrations. Prior studies have demonstrated a methodology for detecting plume-like events in the BEACO2N NOx and CO observations (Kim et al., 2018), and the ratio of these species to CO2 provides a unique signature for each different CO2 source (e.g., Ban-Weiss et al., 2008; Harley et al., 2005; Lopez et al., 2013; Nathan et al., 2018; Turnbull et al., 2015), allowing subsets of the data record to be directly attributed to specific (e.g., mobile) source types and allowing the relationship between these specific activities and CO2 mixing ratios to be derived more precisely. With such a precise methodology for converting between emissions and concentrations, subtler interannual trends in emissions could be detected, for example changes in vehicle emissions following construction of new housing.

Conclusions
We have described the heterogeneity measured at the individual sites of a high-density, surface-level urban CO2 monitoring network. Network-wide correlation length scales are found to be slightly longer during the daytime in summer and generally shorter during winter months, but they fall in the range of values reported previously based on other stationary observation networks and mobile monitoring campaigns. High near-field correlations are thought to be driven by shared sensitivity to local emission events, while moderate far-field correlations reflect regional episodes, suggesting that a given site's data record is likely a convolution of both phenomena. We therefore present a methodology for separating the observed CO2 concentrations into local and regional components and observe distinct distributions (i.e., unimodal vs. bimodal) of local CO2 enhancements within single neighborhoods. A clear relationship is seen between morning rush hour traffic counts and local CO2 concentrations, allowing for the detection of changes in vehicle emissions within 3 years if those changes proceed at a rate consistent with policy objectives.
Most prior studies of urban CO2 emissions (e.g., McKain et al., 2012; Kort et al., 2013; Wu et al., 2018) have favored sparser networks of high-quality instruments, finding this approach to be better suited for resolving trends in total region-wide emissions. It is hypothesized that the ideal monitoring strategy depends on the particular goals and location of a given application, with certain locales and emission sources necessitating high-cost, low-density instrumentation, complemented by other domains where low-cost, high-density platforms are more effective. The potential trade-offs between measurement quality and instrument quantity specific to the San Francisco Bay Area have been investigated previously using an ensemble of observing system simulations by Turner et al. (2016), who found BEACO2N-like observing systems to outperform smaller, higher-quality networks in estimating regional as well as more localized emission phenomena there. While Turner et al. saw significant benefits to achieving an hourly instrument precision of 1 ppm, further increases in measurement quality offered little advantage in constraining emissions, especially those from line and point sources.
This work thus provides an important data-based validation of the conclusions of Turner et al.'s theoretical analysis. Not only do we demonstrate the ability of low-cost sensors to sufficiently constrain policy-relevant trends in line source (i.e., highway traffic) emissions, but we do so without the use of computationally intense and heavily parameterized atmospheric transport models. Furthermore, we show that a multiple-linear-regression analysis allows the signature of highway traffic to be extracted from sites located throughout the network, enabling trends in mobile emissions to be quantified without specially situated roadside monitors. Although this approach requires real-time traffic count information that is not yet available at all locations, our finding is nonetheless an important result, as deriving and implementing a particular a priori network layout is a non-trivial task. Domain-specific transport patterns prevent the development of general principles of optimal sensor placement, and, even if ideal locations can be identified, cooperation from facilities in the area cannot be guaranteed. By establishing for the first time that an ad hoc, opportunistic sensor siting approach can nonetheless provide sensitivity to emission sources of interest, we thus improve the prospects for widespread adoption of distributed monitoring systems in the future.
Progress toward evaluating the capabilities and proper use of low-cost sensors has particular relevance for nations with rapidly developing economies, where CO2 emissions are increasing much faster than the resources needed to monitor them by conventional means. Domestically, citizen science and environmental justice groups are also adopting these technologies (Snyder et al., 2013) as an economically accessible means of advocating for greater public health and ecological wellbeing. While the specific correlation lengths and emission estimates we derive here are unique to the San Francisco Bay Area domain, the sensor performance capabilities and data analysis techniques we outline provide guidance more generally to any future studies attempting to interpret similar datasets around the world. High-resolution surface networks enabled by low-cost technologies offer a unique opportunity to provide ground truth constraints on difficult-to-model near-surface dynamics as well as on the individual CO2 sources and sinks that comprise the strategic backbone of greenhouse gas mitigation regulation.

Figure 1. Map of BEACO2N node locations (black dots). Nodes used in this study are labeled. Map data ©2017 Google.

Figure 2. Optimal correlation coefficients for every possible pairing of summer 2017 sites as a function of their separation distance during all hours (a), daytime hours (11:00-18:00 LT, b), and nighttime hours (21:00-04:00 LT, c). Solid lines show exponential decay of the correlation coefficients.

Figure 3. Optimal correlation coefficients for every possible pairing of winter 2017 sites as a function of their separation distance during all hours (a), daytime hours (11:00-18:00 LT, b), and nighttime hours (21:00-04:00 LT, c). Solid lines show exponential decay of the correlation coefficients.

Figure 4. Optimal correlation coefficients representing networkwide correlation with 5 min mean total CO2 concentrations at four representative sites during daytime hours (11:00-18:00 LT) of winter 2017. Yellow spot (r² = 1) on each subplot shows the location of the site at which the correlation is examined.

Figure 5. Hourly median values of the network-wide, regional CO2 signals calculated for summer (orange) and winter (blue) periods in 2017. Lighter colored curves indicate the standard error; note the difference in y scale.

Figure 6. Morning (04:00-08:00 LT) local summertime CO2 concentrations at LAN shown as a function of nearby highway traffic flow. Darker points indicate the median CO2 concentration observed in each 500 vehicles h⁻¹ traffic flow increment; black solid line indicates the linear regression through the binned medians (equation given above plot), and gray dashed lines show the uncertainty in the regression slope.

Figure 8. Multiple-linear-regression coefficients for five sites derived for each hour of the day on Tuesdays through Thursdays (orange solid line) and Sundays (blue dashed line) between 15 February 2017 and 15 February 2018.

Figure 9. Hourly median CO2 concentrations observed at five sites on Tuesdays through Thursdays (orange solid line) and Sundays (blue dashed line) between 15 February 2017 and 15 February 2018; lighter curves indicate the standard error in the medians.

Figure 10. Morning (04:00-08:00 LT) multiple-linear-regression coefficients shown as a function of summertime traffic flow; black solid lines indicate the linear regression through the MLR factors (equations given above each subplot), and gray dashed lines show the uncertainty in the regression slope.

Table 1. List of site geo-coordinates, relevant traffic monitor IDs, and approximate distances from a highway.

Table 2. Explanatory variables included in the multiple-linear-regression analysis of each site; values indicate the correlation coefficient increase achieved by subsequent inclusion of each variable.