On the distinctiveness of oceanic raindrop regimes

Representation of the drop size distribution (DSD) of rainfall is a key element of characterizing precipitation in models and retrievals, with a functional form necessary to calculate the precipitation flux and the drops' interaction with radiation. With newly available oceanic disdrometer measurements, this study investigates the validity of commonly used DSDs, potentially useful a priori constraints for retrievals, and the forward model errors caused by DSD variability. These data are also compared to leading satellite-based estimates of oceanic DSDs. Forward model errors due to DSD variability are shown to be significant for both active and passive sensors. The modified gamma distribution is found to be generally adequate to describe rain DSDs, but may cause systematic errors for high latitude or stratocumulus rain retrievals; depending on the application, an exponential or generalized gamma function may be preferable for representing oceanic DSDs. An unsupervised classification algorithm finds a variety of DSD shapes that differ from commonly used DSDs, but does not find a singular set that best describes the global variability. Finally, DSD shapes are found to be not particularly distinctive of regional or large-scale environments, but rather occur at varying frequencies over the global oceans.

some way (Testud et al., 2001), or if the separation of parameters in the MGD is either physically meaningful or outperformed by simpler methods (Williams et al., 2014; Tapiador et al., 2014).
The equations below will be referred to throughout the text as the generic MGD function (Eq. 1) and normalized gamma function (Eq. 3), with N(D) the number of drops per volume per size as a function of the drop diameter, D (with D given in mm and N(D) in mm⁻¹ m⁻³). N_0 and N_w are intercept parameters and µ is the shape parameter, though N_w is a normalized intercept parameter. D_m is the mass-weighted mean diameter, the ratio of the fourth and third moments of the distribution (Eq. 2). Γ is the gamma function, ρ_w is the density of water, and RWC is the rain water content in kg m⁻³.

stratiform and convective rainfall is done in various ways and may differ depending on location. Going further, Dolan et al. (2018) argue for six dominant modes of DSDs globally, separated via principal component analysis but linked to meteorology and attendant microphysical regimes. As many studies of drop distributions are from land-based disdrometers and radars, DSD variability has been studied less over open ocean where a majority of global precipitation occurs, though advances are being made in this area (Thompson et al., 2018).
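For reference, the functional forms cited above as Eqs. 1 to 3 are commonly written in the literature (e.g. Testud et al., 2001) as sketched below. This is a reconstruction from the symbol definitions given in the text, not a verbatim copy of the original equations: the slope parameter Λ and the helper function f(µ) are notation assumed here, and units must be made consistent (D_m in mm, RWC/ρ_w converted to mm³ m⁻³) for N_w to come out in mm⁻¹ m⁻³.

```latex
% Generic modified gamma distribution (cf. Eq. 1); \Lambda is an assumed slope parameter
N(D) = N_0 \, D^{\mu} \exp(-\Lambda D)

% Mass-weighted mean diameter (cf. Eq. 2): ratio of the fourth to third moment
D_m = \frac{\int_0^\infty N(D)\, D^4 \, dD}{\int_0^\infty N(D)\, D^3 \, dD}

% Normalized gamma distribution (cf. Eq. 3)
N(D) = N_w \, f(\mu) \left(\frac{D}{D_m}\right)^{\mu}
       \exp\!\left[-(4+\mu)\frac{D}{D_m}\right],
\qquad
f(\mu) = \frac{6}{4^4}\,\frac{(4+\mu)^{\mu+4}}{\Gamma(\mu+4)},
\qquad
N_w = \frac{4^4}{\pi \rho_w}\,\frac{\mathrm{RWC}}{D_m^4}
```

With µ = 0 the factor f(µ) equals 1 and the normalized form reduces to an exponential DSD with intercept N_w and slope 4/D_m.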

In remote sensing applications, one can attempt to solve for all, some, or none of the parameters that define a functional form such as Eq. 1, depending on the information content available. A normalized distribution such as Eq. 3 is used in many precipitation retrievals to separate the water content from the spectrum's shape. In that formulation with RWC separate, this leaves two free parameters to define the distribution since RWC is directly related to N_w through D_m. While passive-only retrievals may need to assume one of these parameters because of the limited signal available (Duncan et al., 2018), radar or combined radar/radiometer retrievals may solve for these parameters in a constrained way (Munchak et al., 2012; Grecu et al., 2016). Precipitation retrievals thus handle the complexity of the DSD differently depending on their sensitivity, but necessarily use a predefined functional form to limit the inverse problem's degrees of freedom.
To investigate the distinctiveness of raindrop shape regimes over the global oceans, and how these regimes may impact retrievals both in terms of prior constraints and radiative transfer modeling, the study proceeds as follows.

Data and methods

Underpinning OceanRAIN is the ODM470 optical disdrometer, a sensor with sensitivity to hydrometeors of diameter 0.3 to 22 mm (Klepp, 2015). The disdrometer is deployed on the superstructure of ships in a package including a cup anemometer and a precipitation detector to activate the disdrometer. A wind vane turns the disdrometer to keep the optical path normal to the wind direction to minimize impacts of turbulence. Only data points marked as rain definite and with a probability of precipitation of 100% were used in the following analysis.

Simulated reflectivities from the ODM470 disdrometer have demonstrated high correlation and a near-zero bias when compared against co-located, vertically-oriented radar observations (see Fig. 6 of Klepp et al., 2018). In comparisons with co-located rain gauges, the optical disdrometer performs better in high wind speeds, as undercatch is a significant problem for traditional rain gauges that can result in underestimation of rainfall accumulation by 50% (Grossklaus et al., 1998; Klepp et al., 2018), though accumulations match within 2% for low wind speeds (Klepp, 2015). The ODM470 has been used in a variety of conditions and shown no difference in accuracy between oceanic and continental cases (Bumke and Seltmann, 2011).
The robustness of disdrometer-derived DSD parameters (following Eq. 3) will depend somewhat on the parameter discussed and the type of rain. For instance, derived D_m should be very robust for all but the very weakest rain rates as it is simply defined (Eq. 2) and requires no fitting. The accuracy of derived N_w may be suspect for cases with high rain rates and a low D_m value, as drops below the sensitivity threshold of 0.39 mm may constitute a non-negligible fraction of total drops, though this depends on the type of rainfall and is an issue faced by all disdrometers (Thurai et al., 2017). The derived shape parameter, µ, is the least robust of the three as it depends on a curve fitting which may not be optimal for very light rain rates or spectra that do not conform to the expected general shape.
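To illustrate why D_m and N_w need no curve fitting, the sketch below computes both directly from the moments of a binned spectrum. It is a minimal example under stated assumptions, not the processing used for OceanRAIN: the function names, the bin layout, and the synthetic exponential test spectrum are all hypothetical, and the N_w expression assumes the standard normalized gamma definition, N_w = (4⁴/6) M_3 / D_m⁴ with M_3 in mm³ m⁻³.

```python
import numpy as np

def dsd_moment(n_d, d_mm, dd_mm, order):
    """Moment of a binned DSD: sum over bins of N(D) * D**order * dD."""
    return np.sum(n_d * d_mm**order * dd_mm)

def mass_weighted_mean_diameter(n_d, d_mm, dd_mm):
    """D_m in mm: ratio of the fourth to the third moment."""
    return dsd_moment(n_d, d_mm, dd_mm, 4) / dsd_moment(n_d, d_mm, dd_mm, 3)

def normalized_intercept(n_d, d_mm, dd_mm):
    """N_w in mm^-1 m^-3, from the third moment (mm^3 m^-3) and D_m (mm).

    Equivalent to 4^4 / (pi * rho_w) * RWC / D_m^4 once RWC / rho_w is
    expressed as a drop volume per unit air volume, (pi / 6) * M_3.
    """
    m3 = dsd_moment(n_d, d_mm, dd_mm, 3)
    dm = mass_weighted_mean_diameter(n_d, d_mm, dd_mm)
    return (4.0**4 / 6.0) * m3 / dm**4

# Hypothetical test spectrum: an exponential DSD on a fine, uniform grid.
bin_width = 0.01                                   # mm
diameters = np.arange(bin_width / 2, 30.0, bin_width)
widths = np.full_like(diameters, bin_width)
spectrum = 8000.0 * np.exp(-2.5 * diameters)       # mm^-1 m^-3

print(mass_weighted_mean_diameter(spectrum, diameters, widths))
print(normalized_intercept(spectrum, diameters, widths))
```

For an exponential spectrum N_0 exp(−ΛD), the analytic results are D_m = 4/Λ and N_w = N_0, which gives a quick sanity check on the moment calculation. Truncating the spectrum at a minimum detectable diameter shifts both values, which is the sensitivity concern raised above for N_w.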

GPM Combined
The Global Precipitation Measurement (GPM; Hou et al., 2014) Core Observatory holds two sensors designed to measure precipitation: the GPM Microwave Imager (GMI) and the Dual-frequency Precipitation Radar (DPR). GMI is a passive microwave radiometer measuring from 10 to 190 GHz and the DPR is a phased array radar measuring at K_u and K_a bands (13.6 and 35.5 GHz, respectively). The dual frequencies of DPR set it apart from other satellite-borne sensors as far as the capacity to solve for the DSD. The GPM core satellite's combination of passive and active sensors provides sensitivity to a large range of precipitating hydrometeors, with information on their emission and scattering characteristics.

The GPM combined algorithm (Grecu et al., 2016) is a retrieval that uses data from both radar and radiometer to solve for profiles of hydrometeors that optimally fit the observations. As the GPM core observatory represents the best observational platform yet flown for measuring near-global precipitation, the combined retrieval from DPR and GMI is included in this study as the state of the art for calculating global DSD statistics. Via the same DSD formulation given in Eq. 3, the combined retrieval first uses the K_u band reflectivities to solve for the D_m profile. It then retrieves N_w at a reduced vertical resolution to match the K_a band and deconvolved GMI brightness temperatures (T_B) using optimal estimation. The shape parameter is fixed at µ = 2 for all cases. For further details about this retrieval, see Grecu et al. (2016).
In this study, gridded level 3 GPM Combined data are used (Olson, 2017). This data set provides statistics of pixel-level derived DSD parameters from Eq. 3 at 5° horizontal resolution. The values used in this study are from the lowest altitude bin and include oceanic pixels only so as to best match the ground-based data from OceanRAIN-M. Because GPM Combined receives most of its information content from DPR, the DSD parameters derived are representative of individual segments of the atmospheric column and not a column average, a key difference from passive-only retrievals. This is significant, as comparison with ground-based observations should be as close in altitude as possible, since DSDs will vary with altitude as evaporation, coalescence, collisions, or other processes modify the spectra (Williams, 2016). The 250 m vertical resolution of DPR means that multiple observations exist below 1 km altitude, though some of these will be affected by ground clutter and so the lowest bin without clutter is chosen here. Note that the GPM Combined retrievals were performed at the native DPR pixel size, which has a 5 km horizontal resolution.

Gaussian Mixture Modeling
Gaussian Mixture Modeling (GMM) is an unsupervised, probabilistic classification technique that attempts to represent a data set as a mixture of Gaussian distributions. The Python package scikit-learn supplied the GMM code (Pedregosa et al., 2011).

GMM easily generalizes to a wide variety of data distributions and can thus identify structures in the data that might be missed by more traditional curve fitting methods. This frees the analysis from explicit assumption of a DSD shape such as Eq. 3. In the approach used here, the "dimensions" given to the GMM module are the size bins used by the OceanRAIN disdrometers and thus the input data are an array of approximately 90000 raining minutes with 60 size bins. These data are unchanged other than being normalized so that DSD "shape" variability in the data set is not weighted by the total number of drops observed, and cut off at 60 size bins as very few drops over 5 mm are ever measured. Because the shapes are independent of the total number of drops, this is analogous to the normalized DSD approach typified by Eq. 3. GMM thus finds common shapes of the observed DSDs and determines the posterior probability of each data point (DSD for each raining minute) falling into each of the various classes. Each observed DSD is assigned to the GMM class for which it has the highest posterior probability. The resultant classes provide insight into dominant structures of the input data, with this approach exemplified in Section 4.2.
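The classification step described above can be sketched with scikit-learn's `GaussianMixture` class. This is a hedged illustration rather than the study's actual code: the normalization (dividing each spectrum by its total drop count), the diagonal covariance type, and the function names are assumptions made here for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def normalize_spectra(counts):
    """Scale each spectrum (row) to unit sum so only its shape remains.

    Assumes every row is a raining minute with at least one drop.
    """
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def classify_dsd_shapes(counts, n_classes, seed=0):
    """Fit a GMM to normalized spectra and assign each one to a class.

    counts: array of shape (n_minutes, n_size_bins).
    Returns the class label per spectrum, the posterior probabilities of
    every class for every spectrum, and the mean shape of each class.
    """
    shapes = normalize_spectra(counts)
    gmm = GaussianMixture(
        n_components=n_classes,
        covariance_type="diag",   # assumption: keeps the fit cheap in 60 dimensions
        random_state=seed,
    )
    gmm.fit(shapes)
    posteriors = gmm.predict_proba(shapes)
    labels = posteriors.argmax(axis=1)    # class with the highest posterior
    return labels, posteriors, gmm.means_
```

For an array shaped like the one described above (about 90000 raining minutes by 60 size bins), `classify_dsd_shapes(counts, n_classes=6)` would return one label per minute and six mean curves that can be plotted against reference MGD shapes. The value 6 is only a placeholder for N_GMM.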
The number of GMM classes is set a priori, with the degree of complexity described by the GMM decomposition dependent on the number of states set by the user. Determining an optimal value for N_GMM is thus important but somewhat subjective because the desired level of complexity retained after the decomposition will vary for different applications. One method for estimating a suitable range for the number of classes is to use the Bayesian Information Criterion (BIC; Eq. 4). This metric balances goodness of fit against model complexity, favoring the variability explained with the fewest possible classes. A plateau of BIC values versus the number of classes, K, would signify no distinctly optimal K to describe the data's variability, but rather a range of solution spaces in which the addition of further states provides marginal additional complexity.
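A scan of BIC against the number of classes, as described above, might look like the following sketch. It relies on the `bic` method of scikit-learn's `GaussianMixture`, which returns the standard criterion (number of free parameters times ln n, minus twice the log-likelihood), so lower values are better. The function name, the diagonal covariance, and the number of restarts are assumptions for illustration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def bic_scan(shapes, class_counts, seed=0, n_init=3):
    """Fit one GMM per candidate class count and record its BIC.

    shapes: array of shape (n_samples, n_features), e.g. normalized DSDs.
    class_counts: iterable of candidate numbers of classes (K).
    Returns a dict mapping K to the BIC of the fitted model.
    """
    scores = {}
    for k in class_counts:
        gmm = GaussianMixture(
            n_components=k,
            covariance_type="diag",  # assumption, chosen for speed
            n_init=n_init,           # a few restarts to avoid poor local optima
            random_state=seed,
        )
        gmm.fit(shapes)
        scores[k] = gmm.bic(shapes)
    return scores
```

Plotting the returned values against K shows whether a clear minimum exists or whether the curve flattens into a plateau. Repeating the scan on random subsets of the data gives a sense of how stable that behavior is.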

Disdrometer data
Viewing the OceanRAIN data all together can provide a sense of the variability in DSD populations over the world's oceans.
From the perspective of global retrievals, constraints on the DSD that depend on the location or environmental regime, rather than, say, partitioning stratiform and convective precipitation a priori, are useful for independent satellite-based products that do not ingest detailed model data, such as the operational retrievals for the GPM constellation radiometers (Kummerow et al., 2015). To this end, the derived parameters of Eq. 3 are given for all raining disdrometer observations in Fig. 1. As this is the DSD form most used in rainfall retrievals currently, it is presented here.
As seen in Fig. 1, the normalized gamma DSD parameters exhibit a wide range of variability that is not strongly tied to location. The strongest trend visible is that warmer ocean surfaces witness greater densities of drops, with the mean log10(N_w) increasing from about 3.5 to 4.0. This is roughly in line with the a priori N_w used for rain by Mason et al. (2017) of 3.9e3, or 3.59 in log space. It is noted that the distributions of D_m and µ are not particularly Gaussian, with the means and medians separate, and N_w only moderately Gaussian in log space.
It is stressed that OceanRAIN observations are not evenly distributed around the global oceans and thus the values seen are dependent on the sampling (i.e. where the ships sailed), so these values are not fully representative of each ocean latitude band.
As surface-based observations they do not provide information as to any vertical DSD variability, a topic that requires radar observations (Williams, 2016). However, it is possible to pick out some meteorological regimes of interest from the derived DSD parameters in OceanRAIN. For instance, the ships' heavy sampling of Southern Hemisphere stratocumulus regions shows up in these plots as a regime characterized by a higher number of small drops and a more peaked distribution (seen in the 20°S to 40°S band in Fig. 1). From the perspective of satellite rainfall retrievals, such location- or cloud regime-dependent a priori constraints are much preferable to a global prior and usable within existing algorithms.

Comparison to GPM
As mentioned in Section 2.2, GPM is the best satellite-borne platform currently available for measuring DSD variability, and thus the best near-global observational DSD data set for comparison. To assess the similarity between GPM estimates and the in situ disdrometer measurements of OceanRAIN, here the retrieved results for N_w and D_m are compared, as the GPM retrieval assumes a constant µ value. To perform this comparison, histograms of level 3 GPM Combined data at 5° resolution were used, spanning 12 months from 2017. Due to the uneven sampling of the ship-borne disdrometers, the only GPM data included in the analysis are from months with valid OceanRAIN data points in each box and defined as ocean pixels by DPR.
No attempt was made to match observations exactly in space and time due to the difficulty of point-to-area comparisons with ship-borne data and GPM (Loew et al., 2017).
The left panel of Fig. 2 shows histograms of derived D_m from the disdrometers compared with GPM Combined, separated by latitude. Given the limited sensitivity of DPR to small drops, it is unsurprising to note that OceanRAIN observes a wider distribution of D_m that is most noticeably distinct from GPM results for small drops. Another key feature of these histograms is that while the maxima in D_m distributions are relatively similar for the two data sets, OceanRAIN observes a much less peaked distribution with a longer tail for larger drops in most latitudes. In the 40° to 60° latitude bin for both hemispheres GPM has a more bimodal distribution. For all latitudes GPM exhibits a strong peak near D_m = 1 mm.

The right panel of Fig. 2 follows the same format but for derived N_w. The most striking aspect of these histograms is the strongly peaked distribution retrieved by GPM in all locations. In contrast, the disdrometers observe many cases with N_w values an order of magnitude greater or smaller than those of the GPM distributions. This would appear to have two leading, plausible explanations. First, OceanRAIN is expected to observe more variability in the number of drops because it is a point measurement integrated over one minute and precipitation characteristics can vary widely over multiple kilometers, whereas DPR has a 5 km footprint. Second, DSD retrieval from GPM is very much an under-constrained problem despite the unique capabilities of DPR. While the altitude mismatch between ground-based disdrometers and the GPM data at a few hundred meters altitude may cause some systematic differences, say due to some evaporation unseen by GPM, this does not explain the limited range of N_w values retrieved by GPM. The strongly peaked N_w distributions seem indicative of the significant influence of the a priori state on retrieval of N_w, in addition to the limited sensitivity to small number concentrations dictated by the instrument sensitivity of DPR.
4 Applicability of the modified gamma distribution

Overall behavior
Without applying any sorting methods or functional forms to the OceanRAIN data, it is worth viewing the data as a whole to see how closely the bulk behavior resembles the distributions commonly used in the literature. Figure 3 shows a two-dimensional probability density function (PDF) of drop diameter normalized by D_m versus number concentration normalized by N_w. This is a view of bulk behavior often used to justify usage of the MGD for precipitation (Bringi et al., 2003; Leinonen et al., 2012), as it permits visualization of in situ data points with the MGD for various µ values including the exponential DSD. Figure 3 indicates that much of the spectral power within OceanRAIN lies near the exponential (µ = 0) line or near the lines with small shape parameters. This is consistent with the enduring popularity of exponential DSDs and the µ = 2 assumption of GPM Combined.

To examine the applicability of the normalized gamma distribution to observed ocean DSDs, we can compare the observed PDF (Fig. 3) with the PDF generated after performing the 3-parameter MGD fit. This is shown in Fig. 4(a), with sample MGD curves given for extreme values of the shape parameter. The MGD-derived PDF overestimates the frequency of points near the exponential line and understandably displays less spread; blue areas indicate over-representation from the MGD fit, red areas indicate under-representation from the MGD fit. As with comparison between the PDF and MGD curves in Fig. 3, this shows an underestimation of small drops at high number concentrations by virtue of being constrained by the MGD fit.

To see if there is some regional dependence within the overall OceanRAIN PDF, Fig. 4(b) divides the data into observations from high latitude (latitudes greater than 50°) and tropical (latitudes less than 20°) locations.
It appears that whereas the MGD with a shape parameter ranging from µ = 0 to µ = 3 suffices for many tropical cases, high latitude observations are not always well represented by this formalism. For high latitude oceanic rainfall, Fig. 4(b) demonstrates that small drops are underestimated and medium drops overestimated if using the MGD with 3 moments or fewer.

One concern raised by the results of Fig. 4 is whether the use of the MGD, and its limited representation of the full PDF of drop sizes, can cause biases in modeled or retrieved rain rates. This is quite straightforward to examine: a size-dependent terminal velocity (Atlas and Ulbrich, 1977) can be assigned to the drops of each size bin, with the rain rate calculated as the integral over diameter of N(D) weighted by drop volume (proportional to D³) and fall velocity. This can then be compared between DSD representations. Using all raining OceanRAIN observations, use of the MGD fit was found to result in a small underestimation of rain rates, by 0.06 mm h⁻¹ or 1.9%. Using the same definitions as above, this underestimation was slightly less pronounced at high latitudes than for tropical locations, 1.5% versus 2.1%. This is due to underestimation of small drops by the MGD fit, as small drops have lower terminal velocities than larger drops, and with RWC being equal this can have a minor impact on resultant fluxes of precipitation.
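The rain rate calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the power-law fall speed v(D) = 3.78 D^0.67 (v in m s⁻¹, D in mm) is the form commonly attributed to Atlas and Ulbrich (1977), the text does not state which coefficients were actually used, and the function names and synthetic spectra are hypothetical.

```python
import numpy as np

def terminal_velocity(d_mm):
    """Fall speed in m/s for drop diameter in mm (assumed power law)."""
    return 3.78 * d_mm**0.67

def rain_rate(n_d, d_mm, dd_mm):
    """Rain rate in mm/h from a binned DSD.

    n_d in mm^-1 m^-3, d_mm and dd_mm in mm. The factor 6e-4 * pi combines
    the drop volume (pi/6 D^3), mm^3 to m^3, and m/s to mm/h conversions.
    """
    return 6e-4 * np.pi * np.sum(terminal_velocity(d_mm) * d_mm**3 * n_d * dd_mm)

def exponential_dsd(d_mm, n0, lam):
    """Exponential DSD, N(D) = n0 * exp(-lam * D)."""
    return n0 * np.exp(-lam * d_mm)

# Two hypothetical spectra with identical water content (same third moment):
# one broad, one steep with many more small drops.
bin_width = 0.01
diameters = np.arange(bin_width / 2, 20.0, bin_width)
widths = np.full_like(diameters, bin_width)
broad = exponential_dsd(diameters, 8000.0, 2.0)
steep = exponential_dsd(diameters, 16 * 8000.0, 4.0)

print(rain_rate(broad, diameters, widths))
print(rain_rate(steep, diameters, widths))
```

Because small drops fall more slowly, the steep spectrum yields the lower rain rate even though both carry the same water content. The same function applied to an observed spectrum and to its MGD fit gives the kind of rain rate difference quoted above.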
Much of the spread that exists in the full OceanRAIN PDF is due to the use of raw observational data that contain discontinuities between size bins and some degree of instrument error. It is clear, however, that much of the spectral power in Fig. 3 is not captured by any one MGD curve. While the exponential line and µ = 3 curves do a reasonable job at matching the PDF for larger drop sizes, the µ = −2 curve performs much better for smaller diameters. This suggests that a 4-parameter "generalized gamma" fit might be optimal for ocean DSDs, a finding echoed in another recent study of disdrometer data (Thurai and Bringi, 2018). Use of the 3-parameter MGD can lead to some systematic biases in drop size representation as seen in Fig. 4(a). These biases can be regionally dependent, as shown by the higher number concentrations of small drops seen in high latitudes relative to the tropics in Fig. 4(b).

GMM states
As shown in Fig. 3, the MGD with a low µ value does a reasonable job at capturing the main power of the observed PDF. However, a great deal of spread exists that is not captured by any one curve. With this in mind, GMM was employed to investigate if a finite number of DSD shapes without a predefined functional form could better capture this variability.

In contrast to the example plots of Fig. 5, Fig. 6 shows the mean GMM curves that arise from running GMM with a few different N_GMM values. Again, this is from running GMM on the raw disdrometer data, with only the number of classes set a priori. For comparison, reference lines of MGD with sample µ values are also given. Note that for each panel in Fig. 6, a majority of the GMM-derived DSDs feature more small drops than given by even the exponential (µ = 0) line. In the simplest case with only two classes possible (first panel of Fig. 6), the DSD shape that best captures the majority of the OceanRAIN data set's variability (at least in terms of frequency of occurrence) is a shape that is more sloped than the exponential DSD, with many small drops and very few large drops. This particular shape is common to all the GMM realizations, with even more steeply sloped curves found as GMM states are added.

have DSD shapes from individual observations that resemble a 3-moment MGD across the whole size domain. In many cases the GMM method prefers states with more steeply sloped DSDs and more small drops than the sample MGD curves given. In fact, it takes higher values of N_GMM (such as in Fig. 6 with N_GMM = 14) before strongly peaked DSD shapes reminiscent of MGD with a large µ value emerge. In other words, DSDs with few small drops, a strong peak of drops around D_m, and for which an exponential is a very poor approximation, are not very common. This can also be seen in Fig. 3, as scant spectral power is seen near the bottom left of that plot.

The GMM framework as applied to the DSD problem seems to offer the promise of finding a finite number of distinct shapes with which global DSD variability can be described, a la Dolan et al. (2018), with the benefit of not constraining the type of shapes found. To investigate this, GMM was used in many iterations for randomly sampled subsets of the data to assess if an optimal number of states exists that describes the global shape variability. In this experiment N_GMM was varied from 2 to 12. The Bayesian Information Criterion (Eq. 4) gauges whether adding further states better describes the data or not, shown in Fig. 7. BIC plateaus and continues a slight decrease for GMM states beyond about N_GMM = 8, indicating that there is no singular set of GMM-derived DSD shapes that outperforms the others. Instead, oceanic DSD shape variability proves to be a true continuum that is not easily decomposed into a linear combination of a finite set of curves.
A corollary of the finding that a singular, optimal set of GMM-derived curves does not exist is that the observed DSD shapes do not display particularly predictable regional patterns. The shapes observed are not especially distinct when decoupled from RWC, whether considering the DSDs regionally or, say, across SST regimes. The GMM-derived shapes are not particularly tied to one region or another, a finding that echoes Fig. 1. This is in contrast to some studies' success in pulling regional attributes out of large data sets via GMM without including location information, as done here (Jones et al., 2019). The only area of OceanRAIN sampling that is particularly distinct in the distribution of GMM states is from observations in stratocumulus regions, where the GMM states characterized by steeply sloped DSD curves with a large number of small drops are dominant.
Otherwise, the GMM states are not strongly tied to particular sampling regions. This tendency changes if DSD is not decoupled from RWC, as RWC regimes are more tied to regional meteorology. But with respect to the retrieval problem, where it is convenient to separate the DSD shape from RWC as in Eq. 3, the GMM approach does not provide a magic bullet.

Radiative transfer impacts
An overlooked aspect of assuming a DSD a priori, or even just assuming the general shape of the DSD a priori, is that this will introduce forward model errors in retrievals and data assimilation. These errors can be strongly correlated across nearby frequencies and can thus cause systematic biases in variational systems if not taken into account. An example of including this type of forward model error into a variational rainfall retrieval for GPM was presented by Duncan et al. (2018). Instead, the focus in this section is investigating the extent of forward model response inherent to variations in natural drop populations, without fitting a functional form to the observed drop counts. Because water content or rain rate is usually the sought parameter from remote sensing retrievals, the results are separated along those lines.

The Atmospheric Radiative Transfer Simulator (ARTS) version 2.3 (Eriksson et al., 2011; Buehler et al., 2018) was used to perform forward model simulations. The ARTS model can handle custom particle size distributions and habits as well as prescribed DSDs such as the MGD. Thus with the full raw data from OceanRAIN it is possible to simulate the interaction of radiation with drop populations without making any simplifications involving the drops' functional form. To approximate the impact on a sensor such as GMI on GPM, simulations were run using the GMI geometry and three GMI frequencies: 18.7, 36.64, and 89.0 GHz. Because the surface-based disdrometer data inherently lack vertical information, hydrometeor and humidity profiles need to be assumed. To avoid complications from inclusion of any ice scattering species, the setup is for warm rain: a 1 km rain layer defined by the RWC and DSD observed, with a 1 km liquid cloud layer of 200 g m⁻² above, characteristic of a raining warm cloud (Lebsock et al., 2008). The surface properties and humidity profile are typical of a tropical scene, with the surface emissivity calculated using TESSEM2 (Prigent et al., 2017). DSD properties are constant within the rain layer and the cloud layer is also homogeneous. Simulation code is available (Duncan, 2019).

variability for a given RWP, with the standard deviation of the T_B response usually about half of the net response. This is a significant error source for warm rain estimation, as the difference between a RWP of 0.2 and 0.3 kg m⁻² would be difficult to distinguish using these frequencies alone due to the overlapping forward model error bounds.
To address the point-to-area issue of comparing OceanRAIN observations integrated every minute with those of a spaceborne passive microwave or radar footprint, which is 5 km in the best case, Fig. 8(b) shows a sample result if the disdrometer data are averaged in time. Averaging in time is performed because it approximates a spatial average, absent other observing points. Specifically, a nominal 6 minute window was used to average consecutive raining disdrometer measurements. Non-raining points were not included or added if the OceanRAIN points were discontinuous in time. Fig. 8(b) shows that the results are quite similar to the native disdrometer data used in panel (a), and thus the maximum forward model errors observed by a sensor such as GMI should not be markedly different.

Without needing to assume a model atmosphere, the variability of radar reflectivities can be simulated with the measured volume of drops alone and the T-matrix method. Figure 9 gives the simulated radar reflectivity response over a range of rain rates using the OceanRAIN observations. As with the passive sensor simulations, this demonstrates that DSD variability can cause significant differences in the radiative properties of a volume of drops even for equivalent rain rates or water contents. As with Fig. 8 Rayleigh regime or partly in the Mie regime. The K_a band is less affected by DSD variations in both the passive and active simulations while scaling mostly linearly with increasing RWC or rain rate.
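The sensitivity of reflectivity to DSD shape at fixed water content can be illustrated with a much simpler stand-in for the T-matrix calculations used above: the Rayleigh approximation, in which the reflectivity factor is the sixth moment of the DSD. This swap is an assumption made purely for illustration and is only reasonable when drops are small relative to the wavelength; the function names and synthetic spectra are hypothetical.

```python
import numpy as np

def rayleigh_reflectivity_dbz(n_d, d_mm, dd_mm):
    """Reflectivity factor in dBZ under the Rayleigh approximation.

    Z is the sixth moment of the DSD in mm^6 m^-3, with n_d in mm^-1 m^-3
    and diameters in mm. This is not the T-matrix method.
    """
    z_linear = np.sum(n_d * d_mm**6 * dd_mm)
    return 10.0 * np.log10(z_linear)

def exponential_dsd(d_mm, n0, lam):
    """Exponential DSD, N(D) = n0 * exp(-lam * D)."""
    return n0 * np.exp(-lam * d_mm)

# Two hypothetical spectra with the same third moment (same water content).
bin_width = 0.01
diameters = np.arange(bin_width / 2, 30.0, bin_width)
widths = np.full_like(diameters, bin_width)
broad = exponential_dsd(diameters, 8000.0, 2.0)
steep = exponential_dsd(diameters, 16 * 8000.0, 4.0)

dbz_broad = rayleigh_reflectivity_dbz(broad, diameters, widths)
dbz_steep = rayleigh_reflectivity_dbz(steep, diameters, widths)
print(dbz_broad - dbz_steep)
```

For these two exponential spectra the sixth moments differ analytically by a factor of 8, or about 9 dB, despite identical water content. Differences of this kind are what produce the spread in simulated reflectivity for equivalent rain rates or water contents.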

Summary and conclusions
This study has investigated the variability of raindrop size distributions over the global oceans in a variety of contexts relevant to retrievals and atmospheric modeling. Methods to attach a functional form to raindrop populations vary, but have largely been

The disdrometer data were shown to have limited dependence on latitude or SST (Fig. 1) when quantified using parameters of the normalized gamma distribution (Eq. 3). The mean and median of D_m tend to vary within 0.1 mm across all regions, with ±σ of about 0.2 mm. Most observations of log10(N_w) fall within 3.0 to 4.3, with a weak correlation observed between N_w and SST. These parameters from OceanRAIN were also compared to the leading estimates from a satellite platform (Fig. 2); comparisons with GPM matched relatively well for distributions of D_m but less so for N_w. Both parameters appear to be too peaked from the GPM retrieval, likely a result of strong influence from that retrieval's a priori state, as D_m = 1.0 and log10(N_w) = 4.0 were commonly seen. The data sets observe similar spreads in the distributions of D_m, but the disdrometers observe significantly more variability in N_w than seen by GPM; the middle 90% of GPM N_w retrievals fall within one order of magnitude, whereas the middle 90% of disdrometer observations span 2.2 orders of magnitude. It is speculated that the GPM retrievals may be over-constrained, although it was expected that the point measurements of the disdrometer would display greater variability than those from satellite sources due to spatial considerations alone.
Usage of the normalized gamma function to encapsulate the observed DSD behavior was questioned, as it appears more applicable in the Tropics than for higher latitude populations (Fig. 4). Its use can cause systematic biases in rain rate estimation, quantified as a mean error of −2% relative to total accumulation calculated with the raw disdrometer size data. This is a relatively small error for total accumulation because the smallest drops that are most misrepresented by the normalized gamma formalism account for relatively little of the total mass flux; however, for about 3% of cases this is an error of −0.5 mm h⁻¹ or more, and can thus be significant. For many applications, an exponential DSD would be simpler and more appropriate than an MGD for oceanic rainfall (Fig. 3), but of course does not encapsulate the range of variability that exists, which may be better represented by a generalized gamma approach (Thurai and Bringi, 2018).
Radiative properties of raindrop populations can vary rapidly for low frequency microwaves, manifest in Fig. 8, where uncertainty makes up approximately half the radiative signal at 18 GHz but much less at higher frequencies. This is because the presence of a few larger drops can cause non-negligible Mie scattering that impacts the otherwise emission-dominated radiative signal and Rayleigh scattering from smaller drops, an effect that diminishes as frequency increases. Fig. 9 also showed this effect, with lower frequencies exhibiting greater uncertainty for a given RWC or rain rate due to observed DSD variability. Whereas the radiative uncertainty is similar for light rain rates, modeled variability can be 2-3 times greater at K_u band than at K_a band, for both passive and active simulations. These ranges of forward model variability, however, represent a worst case scenario for satellite retrievals or data assimilation, as any skill in assuming or retrieving the DSD would shrink these ranges. This passive forward model variability can even be viewed favorably, as it demonstrates sensitivity to the DSD at low frequencies that may aid DSD retrievals. Simulations comparing forward model errors caused by using a GMM-derived or MGD state compared to the true DSD state showed that a high N_GMM value was needed for the GMM states to outperform the 3-moment MGD for forward model errors (not shown). This is in line with Fig. 7, but also indicative that it is hard for a single-moment scheme such as GMM to compete without having a large number of possible states.
This exploration of DSD shape "distinctiveness" was motivated by the remote sensing and modeling communities' need for simple but accurate parameterizations of rainwater's size distribution. For instance, if a region or rainfall regime tends to exhibit one or two DSD shapes, this simplifies a multidimensional problem considerably. The results, however, demonstrate that simple separation of DSD shapes by latitude and SST, or by other variables such as dewpoint temperature and RWC (not shown), does not significantly simplify the DSD problem. The limited spatiotemporal sampling of OceanRAIN meant that further subdivision of regional data for seasonal shifts in DSD was not possible. The conclusion is then that global oceanic DSD variability, though more uniform than over land surfaces, is complex and not easily reduced to a single moment parameterization or a small set of possible shapes.
Code availability. The code used for analysis is all available in the form of iPython notebooks via a Zenodo archive, found in the references.