esda.smoothing — Smoothing of spatial rates
New in version 1.0.

class pysal.esda.smoothing.Excess_Risk(e, b)
Excess Risk
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units

r
array (n, 1) – excess risk values
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating an instance of Excess_Risk class using stl_e and stl_b
>>> er = Excess_Risk(stl_e, stl_b)
Extracting the excess risk values through the property r of the Excess_Risk instance, er
>>> er.r[:10]
array([ 0.20665681,  0.43613787,  0.42078261,  0.22066928,  0.57981596,
        0.35301709,  0.56407549,  0.17020994,  0.3052372 ,  0.25821905])
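The idea behind these values can be reproduced with a short NumPy sketch (an illustration of the computation, not PySAL's implementation): each unit's raw rate e_i / b_i is divided by the pooled rate sum(e) / sum(b), so values above 1 mark units with more events than the overall rate implies. Toy arrays are used here instead of the stl data.

```python
import numpy as np

def excess_risk_sketch(e, b):
    # Ratio of each unit's raw rate to the pooled rate over all units.
    return (e / b) / (e.sum() / b.sum())

e = np.array([10., 1., 3., 4., 2., 5.])
b = np.array([100., 15., 20., 20., 80., 90.])
r = excess_risk_sketch(e, b)
# r[0] == 0.1 / (25 / 325) == 1.3, i.e. 30% more events than expected
```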


class pysal.esda.smoothing.Empirical_Bayes(e, b)
Aspatial Empirical Bayes Smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units

r
array (n, 1) – rate values from Empirical Bayes Smoothing
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating an instance of Empirical_Bayes class using stl_e and stl_b
>>> eb = Empirical_Bayes(stl_e, stl_b)
Extracting the risk values through the property r of the Empirical_Bayes instance, eb
>>> eb.r[:10]
array([  2.36718950e-05,   4.54539167e-05,   4.78114019e-05,   2.76907146e-05,
         6.58989323e-05,   3.66494122e-05,   5.79952721e-05,   2.03064590e-05,
         3.31152999e-05,   3.02748380e-05])
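Aspatial Empirical Bayes smoothing shrinks each raw rate toward the pooled mean, with more shrinkage where the population at risk is small. A minimal sketch using one common method-of-moments estimator (an assumption for illustration; PySAL's exact estimator may differ in details):

```python
import numpy as np

def empirical_bayes_sketch(e, b):
    # Shrink each raw rate toward the pooled mean m; units with small b_i
    # (noisy raw rates) are pulled more strongly toward m.
    e, b = np.asarray(e, float), np.asarray(b, float)
    raw = e / b
    m = e.sum() / b.sum()                       # pooled prior mean
    s2 = (b * (raw - m) ** 2).sum() / b.sum()   # population-weighted variance
    a = max(s2 - m / b.mean(), 0.0)             # prior variance estimate, floored at 0
    w = a / (a + m / b)                         # shrinkage weight in [0, 1)
    return w * raw + (1.0 - w) * m

e = np.array([10., 1., 3., 4., 2., 5.])
b = np.array([100., 15., 20., 20., 80., 90.])
eb = empirical_bayes_sketch(e, b)
```

Each smoothed value lies between the unit's raw rate and the pooled mean, which is the defining behavior of the Empirical Bayes shrinkage estimator.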


class pysal.esda.smoothing.Spatial_Empirical_Bayes(e, b, w)
Spatial Empirical Bayes Smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 w (spatial weights instance) –

r
array (n, 1) – rate values from Empirical Bayes Smoothing
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating a spatial weights instance by reading in stl.gal file.
>>> stl_w = pysal.open(pysal.examples.get_path('stl.gal'), 'r').read()
Ensuring that the elements in the spatial weights instance are ordered by the given sequential numbers from 1 to the number of observations in stl_hom.csv
>>> if not stl_w.id_order_set: stl_w.id_order = range(1,len(stl) + 1)
Creating an instance of Spatial_Empirical_Bayes class using stl_e, stl_b, and stl_w
>>> s_eb = Spatial_Empirical_Bayes(stl_e, stl_b, stl_w)
Extracting the risk values through the property r of s_eb
>>> s_eb.r[:10]
array([  4.01485749e-05,   3.62437513e-05,   4.93034844e-05,   5.09387329e-05,
         3.72735210e-05,   3.69333797e-05,   5.40245456e-05,   2.99806055e-05,
         3.73034109e-05,   3.47270722e-05])


class pysal.esda.smoothing.Spatial_Rate(e, b, w)
Spatial Rate Smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 w (spatial weights instance) –

r
array (n, 1) – rate values from spatial rate smoothing
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating a spatial weights instance by reading in stl.gal file.
>>> stl_w = pysal.open(pysal.examples.get_path('stl.gal'), 'r').read()
Ensuring that the elements in the spatial weights instance are ordered by the given sequential numbers from 1 to the number of observations in stl_hom.csv
>>> if not stl_w.id_order_set: stl_w.id_order = range(1,len(stl) + 1)
Creating an instance of Spatial_Rate class using stl_e, stl_b, and stl_w
>>> sr = Spatial_Rate(stl_e,stl_b,stl_w)
Extracting the risk values through the property r of sr
>>> sr.r[:10]
array([  4.59326407e-05,   3.62437513e-05,   4.98677081e-05,   5.09387329e-05,
         3.72735210e-05,   4.01073093e-05,   3.79372794e-05,   3.27019246e-05,
         4.26204928e-05,   3.47270722e-05])
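Spatial rate smoothing pools each unit's events and population with those of its neighbors before taking the rate. A dense-matrix sketch of that idea (assuming a binary contiguity matrix W for illustration; PySAL works with a weights object, not a dense matrix):

```python
import numpy as np

def spatial_rate_sketch(e, b, W):
    # Pool each unit's own events/population with its neighbors' (rows of the
    # binary matrix W select neighbors), then take the pooled rate.
    num = e + W @ e
    den = b + W @ b
    return num / den

e = np.array([1., 2., 3.])
b = np.array([10., 10., 10.])
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)  # a 3-unit chain: 0-1-2
r = spatial_rate_sketch(e, b, W)  # [0.15, 0.2, 0.25]
```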


class pysal.esda.smoothing.Kernel_Smoother(e, b, w)
Kernel smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 w (Kernel weights instance) –

r
array (n, 1) – rate values from kernel smoothing
Examples
Creating an array including event values for 6 regions
>>> e = np.array([10, 1, 3, 4, 2, 5])
Creating another array including population-at-risk values for the 6 regions
>>> b = np.array([100, 15, 20, 20, 80, 90])
Creating a list containing geographic coordinates of the 6 regions’ centroids
>>> points=[(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
Creating a kernel-based spatial weights instance by using the above points
>>> kw=Kernel(points)
Ensuring that the elements in the kernel-based weights are ordered by the given sequential numbers from 0 to 5
>>> if not kw.id_order_set: kw.id_order = range(0,len(points))
Applying kernel smoothing to e and b
>>> kr = Kernel_Smoother(e, b, kw)
Extracting the smoothed rates through the property r of the Kernel_Smoother instance
>>> kr.r
array([ 0.10543301,  0.0858573 ,  0.08256196,  0.09884584,  0.04756872,  0.04845298])


class pysal.esda.smoothing.Age_Adjusted_Smoother(e, b, w, s, alpha=0.05)
Age-adjusted rate smoothing
Parameters:  e (array (n*h, 1)) – event variable measured for each age group across n spatial units
 b (array (n*h, 1)) – population at risk variable measured for each age group across n spatial units
 w (spatial weights instance) –
 s (array (n*h, 1)) – standard population for each age group across n spatial units

r
array (n, 1) – rate values from spatial rate smoothing
Notes
Weights used to smooth age-specific events and populations are simple binary weights
Examples
Creating an array including 12 values for the 6 regions with 2 age groups
>>> e = np.array([10, 8, 1, 4, 3, 5, 4, 3, 2, 1, 5, 3])
Creating another array including 12 population-at-risk values for the 6 regions
>>> b = np.array([100, 90, 15, 30, 25, 20, 30, 20, 80, 80, 90, 60])
For age adjustment, we need another array containing standard population values for the 6 regions
>>> s = np.array([98, 88, 15, 29, 20, 23, 33, 25, 76, 80, 89, 66])
Creating a list containing geographic coordinates of the 6 regions’ centroids
>>> points=[(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
Creating a kernel-based spatial weights instance by using the above points
>>> kw=Kernel(points)
Ensuring that the elements in the kernel-based weights are ordered by the given sequential numbers from 0 to 5
>>> if not kw.id_order_set: kw.id_order = range(0,len(points))
Applying age-adjusted smoothing to e and b
>>> ar = Age_Adjusted_Smoother(e, b, kw, s)
Extracting the smoothed rates through the property r of the Age_Adjusted_Smoother instance
>>> ar.r
array([ 0.10519625,  0.08494318,  0.06440072,  0.06898604,  0.06952076,  0.05020968])

classmethod by_col(df, e, b, w=None, s=None, **kwargs)
Compute smoothing by columns in a dataframe.
Parameters:  df (pandas.DataFrame) – a dataframe containing the data to be smoothed
 e (string or list of strings) – the name or names of columns containing event variables to be smoothed
 b (string or list of strings) – the name or names of columns containing the population variables to be smoothed
 w (pysal.weights.W or list of pysal.weights.W) – the spatial weights object or objects to use with the event-population pairs. If not provided and a weights object is in the dataframe’s metadata, that weights object will be used.
 s (string or list of strings) – the name or names of columns to use as a standard population variable for the events e and at-risk populations b.
 inplace (bool) – a flag denoting whether to output a copy of df with the relevant smoothed columns appended, or to append the columns directly to df itself.
 **kwargs (optional keyword arguments) – optional keyword options that are passed directly to the smoother.
Returns: a copy of df containing the columns. Or, if inplace, this returns None, but implicitly adds columns to df.

class pysal.esda.smoothing.Disk_Smoother(e, b, w)
Locally weighted averages or disk smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 w (spatial weights instance) –

r
array (n, 1) – rate values from disk smoothing
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating a spatial weights instance by reading in stl.gal file.
>>> stl_w = pysal.open(pysal.examples.get_path('stl.gal'), 'r').read()
Ensuring that the elements in the spatial weights instance are ordered by the given sequential numbers from 1 to the number of observations in stl_hom.csv
>>> if not stl_w.id_order_set: stl_w.id_order = range(1,len(stl) + 1)
Applying disk smoothing to stl_e and stl_b
>>> sr = Disk_Smoother(stl_e,stl_b,stl_w)
Extracting the risk values through the property r of sr
>>> sr.r[:10]
array([  4.56502262e-05,   3.44027685e-05,   3.38280487e-05,   4.78530468e-05,
         3.12278573e-05,   2.22596997e-05,   2.67074856e-05,   2.36924573e-05,
         3.48801587e-05,   3.09511832e-05])


class pysal.esda.smoothing.Spatial_Median_Rate(e, b, w, aw=None, iteration=1)
Spatial Median Rate Smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 w (spatial weights instance) –
 aw (array (n, 1)) – auxiliary weight variable measured across n spatial units
 iteration (integer) – the number of iterations

r
array (n, 1) – rate values from spatial median rate smoothing

w
spatial weights instance

aw
array (n, 1) – auxiliary weight variable measured across n spatial units
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Creating a spatial weights instance by reading in stl.gal file.
>>> stl_w = pysal.open(pysal.examples.get_path('stl.gal'), 'r').read()
Ensuring that the elements in the spatial weights instance are ordered by the given sequential numbers from 1 to the number of observations in stl_hom.csv
>>> if not stl_w.id_order_set: stl_w.id_order = range(1,len(stl) + 1)
Computing spatial median rates without iteration
>>> smr0 = Spatial_Median_Rate(stl_e,stl_b,stl_w)
Extracting the computed rates through the property r of the Spatial_Median_Rate instance
>>> smr0.r[:10]
array([  3.96047383e-05,   3.55386859e-05,   3.28308921e-05,   4.30731238e-05,
         3.12453969e-05,   1.97300409e-05,   3.10159267e-05,   2.19279204e-05,
         2.93763432e-05,   2.93763432e-05])
Recomputing spatial median rates with 5 iterations
>>> smr1 = Spatial_Median_Rate(stl_e,stl_b,stl_w,iteration=5)
Extracting the computed rates through the property r of the Spatial_Median_Rate instance
>>> smr1.r[:10]
array([  3.11293620e-05,   2.95956330e-05,   3.11293620e-05,   3.10159267e-05,
         2.98436066e-05,   2.76406686e-05,   3.10159267e-05,   2.94788171e-05,
         2.99460806e-05,   2.96981070e-05])
Computing spatial median rates by using the base variable as auxiliary weights without iteration
>>> smr2 = Spatial_Median_Rate(stl_e,stl_b,stl_w,aw=stl_b)
Extracting the computed rates through the property r of the Spatial_Median_Rate instance
>>> smr2.r[:10]
array([  5.77412020e-05,   4.46449551e-05,   5.77412020e-05,   5.77412020e-05,
         4.46449551e-05,   3.61363528e-05,   3.61363528e-05,   4.46449551e-05,
         5.77412020e-05,   4.03987355e-05])
Recomputing spatial median rates by using the base variable as auxiliary weights with 5 iterations
>>> smr3 = Spatial_Median_Rate(stl_e,stl_b,stl_w,aw=stl_b,iteration=5)
Extracting the computed rates through the property r of the Spatial_Median_Rate instance
>>> smr3.r[:10]
array([  3.61363528e-05,   4.46449551e-05,   3.61363528e-05,   3.61363528e-05,
         4.46449551e-05,   3.61363528e-05,   3.61363528e-05,   4.46449551e-05,
         3.61363528e-05,   4.46449551e-05])

class pysal.esda.smoothing.Spatial_Filtering(bbox, data, e, b, x_grid, y_grid, r=None, pop=None)
Spatial Filtering
Parameters:  bbox (a list of two lists where each list is a pair of coordinates) – a bounding box for the entire n spatial units
 data (array (n, 2)) – x, y coordinates
 e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 x_grid (integer) – the number of cells on the x-axis
 y_grid (integer) – the number of cells on the y-axis
 r (float) – fixed radius of a moving window
 pop (integer) – population threshold to create adaptive moving windows

grid
array (x_grid*y_grid, 2) – x, y coordinates for grid points

r
array (x_grid*y_grid, 1) – rate values for grid points
Notes
No tool is provided to find an optimal value for r or pop.
Examples
Reading data in stl_hom.csv into stl to extract values for event and population-at-risk variables
>>> stl = pysal.open(pysal.examples.get_path('stl_hom.csv'), 'r')
Reading the stl data in the WKT format so that we can easily extract polygon centroids
>>> fromWKT = pysal.core.util.WKTParser()
>>> stl.cast('WKT',fromWKT)
Extracting polygon centroids through iteration
>>> d = np.array([i.centroid for i in stl[:,0]])
Specifying the bounding box for the stl_hom data. The bbox should include two points, for the left-bottom and the right-top corners
>>> bbox = [[-92.700676, 36.881809], [-87.916573, 40.3295669]]
The 11th and 14th columns in stl_hom.csv include the number of homicides and population. Creating two arrays from these columns.
>>> stl_e, stl_b = np.array(stl[:,10]), np.array(stl[:,13])
Applying spatial filtering by using a 10*10 mesh grid and a moving window with a radius of 2
>>> sf_0 = Spatial_Filtering(bbox,d,stl_e,stl_b,10,10,r=2)
Extracting the resulting rates through the property r of the Spatial_Filtering instance
>>> sf_0.r[:10]
array([  4.23561763e-05,   4.45290850e-05,   4.56456221e-05,   4.49133384e-05,
         4.39671835e-05,   4.44903042e-05,   4.19845497e-05,   4.11936548e-05,
         3.93463504e-05,   4.04376345e-05])
Applying another spatial filtering by allowing the moving window to grow until 600000 people are found in the window
>>> sf = Spatial_Filtering(bbox,d,stl_e,stl_b,10,10,pop=600000)
Checking the size of the resulting array including the rates
>>> sf.r.shape
(100,)
Extracting the resulting rates through the property r of the Spatial_Filtering instance
>>> sf.r[:10]
array([  3.73728738e-05,   4.04456300e-05,   4.04456300e-05,   3.81035327e-05,
         4.54831940e-05,   4.54831940e-05,   3.75658628e-05,   3.75658628e-05,
         3.75658628e-05,   3.75658628e-05])

classmethod by_col(df, e, b, x_grid, y_grid, geom_col='geometry', **kwargs)
Compute smoothing by columns in a dataframe. The bounding box and point information is computed from the geometry column.
Parameters:  df (pandas.DataFrame) – a dataframe containing the data to be smoothed
 e (string or list of strings) – the name or names of columns containing event variables to be smoothed
 b (string or list of strings) – the name or names of columns containing the population variables to be smoothed
 x_grid (integer) – number of grid cells to use along the x-axis
 y_grid (integer) – number of grid cells to use along the y-axis
 geom_col (string) – the name of the column in the dataframe containing the geometry information.
 **kwargs (optional keyword arguments) – optional keyword options that are passed directly to the smoother.
Returns: a new dataframe of dimension (x_grid*y_grid, 3), containing the coordinates of the grid cells and the rates associated with those grid cells.

class pysal.esda.smoothing.Headbanging_Triples(data, w, k=5, t=3, angle=135.0, edgecor=False)
Generate a pseudo spatial weights instance that contains headbanging triples
Parameters:  data (array (n, 2)) – numpy array of x, y coordinates
 w (spatial weights instance) –
 k (integer) – the number of nearest neighbors
 t (integer) – the number of triples
 angle (integer between 0 and 180) – the angle criterion for a set of triples
 edgecor (boolean) – whether or not correction for edge points is made

triples
dictionary – key is observation record id, value is a list of lists of triple ids

extra
dictionary – key is observation record id, value is a list containing the following: a tuple of the original triple observations, the distance between the original triple observations, and the distance between an original triple observation and its extrapolated point
Examples
importing the k-nearest neighbor weights creator
>>> from pysal import knnW_from_array
Reading data in stl_hom.csv into stl_db to extract values for event and population-at-risk variables
>>> stl_db = pysal.open(pysal.examples.get_path('stl_hom.csv'),'r')
Reading the stl data in the WKT format so that we can easily extract polygon centroids
>>> fromWKT = pysal.core.util.WKTParser()
>>> stl_db.cast('WKT',fromWKT)
Extracting polygon centroids through iteration
>>> d = np.array([i.centroid for i in stl_db[:,0]])
Using the centroids, we create a 5-nearest neighbor weights instance
>>> w = knnW_from_array(d,k=5)
Ensuring that the elements in the spatial weights instance are ordered by the order of stl_db’s IDs
>>> if not w.id_order_set: w.id_order = w.id_order
Finding headbanging triples by using 5 nearest neighbors
>>> ht = Headbanging_Triples(d,w,k=5)
Checking the members of triples
>>> for k, item in ht.triples.items()[:5]: print k, item
0 [(5, 6), (10, 6)]
1 [(4, 7), (4, 14), (9, 7)]
2 [(0, 8), (10, 3), (0, 6)]
3 [(4, 2), (2, 12), (8, 4)]
4 [(8, 1), (12, 1), (8, 9)]
Opening sids2.shp file
>>> sids = pysal.open(pysal.examples.get_path('sids2.shp'),'r')
Extracting the centroids of polygons in the sids data
>>> sids_d = np.array([i.centroid for i in sids])
Creating a 5-nearest neighbor weights instance from the sids centroids
>>> sids_w = knnW_from_array(sids_d,k=5)
Ensuring that the members in sids_w are ordered by the order of sids_d’s ID
>>> if not sids_w.id_order_set: sids_w.id_order = sids_w.id_order
Finding headbanging triples by using 5 nearest neighbors
>>> s_ht = Headbanging_Triples(sids_d,sids_w,k=5)
Checking the members of the found triples
>>> for k, item in s_ht.triples.items()[:5]: print k, item
0 [(1, 18), (1, 21), (1, 33)]
1 [(2, 40), (2, 22), (22, 40)]
2 [(39, 22), (1, 9), (39, 17)]
3 [(16, 6), (19, 6), (20, 6)]
4 [(5, 15), (27, 15), (35, 15)]
Finding headbanging triples by using 5 nearest neighbors with edge correction
>>> s_ht2 = Headbanging_Triples(sids_d,sids_w,k=5,edgecor=True)
Checking the members of the found triples
>>> for k, item in s_ht2.triples.items()[:5]: print k, item
0 [(1, 18), (1, 21), (1, 33)]
1 [(2, 40), (2, 22), (22, 40)]
2 [(39, 22), (1, 9), (39, 17)]
3 [(16, 6), (19, 6), (20, 6)]
4 [(5, 15), (27, 15), (35, 15)]
Checking the extrapolated point that is introduced into the triples during edge correction
>>> extrapolated = s_ht2.extra[72]
Checking the observation IDs constituting the extrapolated triple
>>> extrapolated[0] (89, 77)
Checking the distances between the extrapolated point and observations 89 and 77
>>> round(extrapolated[1],5), round(extrapolated[2],6)
(0.33753, 0.302707)

class pysal.esda.smoothing.Headbanging_Median_Rate(e, b, t, aw=None, iteration=1)
Headbanging Median Rate Smoothing
Parameters:  e (array (n, 1)) – event variable measured across n spatial units
 b (array (n, 1)) – population at risk variable measured across n spatial units
 t (Headbanging_Triples instance) –
 aw (array (n, 1)) – auxiliary weight variable measured across n spatial units
 iteration (integer) – the number of iterations

r
¶ array (n, 1) – rate values from headbanging median smoothing
Examples
importing the k-nearest neighbor weights creator
>>> from pysal import knnW_from_array
opening the sids2 shapefile
>>> sids = pysal.open(pysal.examples.get_path('sids2.shp'), 'r')
extracting the centroids of polygons in the sids2 data
>>> sids_d = np.array([i.centroid for i in sids])
creating a 5-nearest neighbor weights instance from the centroids
>>> sids_w = knnW_from_array(sids_d,k=5)
ensuring that the members in sids_w are ordered
>>> if not sids_w.id_order_set: sids_w.id_order = sids_w.id_order
finding headbanging triples by using 5 neighbors
>>> s_ht = Headbanging_Triples(sids_d,sids_w,k=5)
reading in the sids2 data table
>>> sids_db = pysal.open(pysal.examples.get_path('sids2.dbf'), 'r')
extracting the 10th and 9th columns in the sids2.dbf and using data values as event and population-at-risk variables
>>> s_e, s_b = np.array(sids_db[:,9]), np.array(sids_db[:,8])
computing headbanging median rates from s_e, s_b, and s_ht
>>> sids_hb_r = Headbanging_Median_Rate(s_e,s_b,s_ht)
extracting the computed rates through the property r of the Headbanging_Median_Rate instance
>>> sids_hb_r.r[:5]
array([ 0.00075586,  0.        ,  0.0008285 ,  0.0018315 ,  0.00498891])
recomputing headbanging median rates with 5 iterations
>>> sids_hb_r2 = Headbanging_Median_Rate(s_e,s_b,s_ht,iteration=5)
extracting the computed rates through the property r of the Headbanging_Median_Rate instance
>>> sids_hb_r2.r[:5]
array([ 0.0008285 ,  0.00084331,  0.00086896,  0.0018315 ,  0.00498891])
recomputing headbanging median rates by considering a set of auxiliary weights
>>> sids_hb_r3 = Headbanging_Median_Rate(s_e,s_b,s_ht,aw=s_b)
extracting the computed rates through the property r of the Headbanging_Median_Rate instance
>>> sids_hb_r3.r[:5]
array([ 0.00091659,  0.        ,  0.00156838,  0.0018315 ,  0.00498891])

classmethod by_col(df, e, b, t=None, geom_col='geometry', inplace=False, **kwargs)
Compute smoothing by columns in a dataframe. The bounding box and point information is computed from the geometry column.
Parameters:  df (pandas.DataFrame) – a dataframe containing the data to be smoothed
 e (string or list of strings) – the name or names of columns containing event variables to be smoothed
 b (string or list of strings) – the name or names of columns containing the population variables to be smoothed
 t (Headbanging_Triples instance or list of Headbanging_Triples) – list of headbanging triples instances. If not provided, this is computed from the geometry column of the dataframe.
 geom_col (string) – the name of the column in the dataframe containing the geometry information.
 inplace (bool) – a flag denoting whether to output a copy of df with the relevant smoothed columns appended, or to append the columns directly to df itself.
 **kwargs (optional keyword arguments) – optional keyword options that are passed directly to the smoother.
Returns: a new dataframe containing the smoothed Headbanging Median Rates for the event/population pairs. If done inplace, there is no return value and df is modified in place.

pysal.esda.smoothing.flatten(l, unique=True)
flatten a list of lists
Parameters:  l (list) – of lists
 unique (boolean) – whether or not only unique items are wanted (default=True)
Returns: a list of single items
Return type: list
Examples
Creating a sample list whose elements are lists of integers
>>> l = [[1, 2], [3, 4, ], [5, 6]]
Applying flatten function
>>> flatten(l)
[1, 2, 3, 4, 5, 6]

pysal.esda.smoothing.weighted_median(d, w)
A utility function to find a median of d based on w
Parameters:  d (array (n, 1)) – variable for which the weighted median is found
 w (array (n, 1)) – weight values
Notes
d and w are arranged in the same order
Returns: median of d
Return type: float
Examples
Creating an array including five integers. We will get the median of these integers.
>>> d = np.array([5,4,3,1,2])
Creating another array including weight values for the above integers. The median of d will be decided with a consideration to these weight values.
>>> w = np.array([10, 22, 9, 2, 5])
Applying weighted_median function
>>> weighted_median(d, w)
4
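The weighted median can be sketched in a few lines of NumPy (an illustration of the idea, not necessarily PySAL's implementation): sort the values, accumulate the weights, and return the first value whose cumulative weight reaches half the total.

```python
import numpy as np

def weighted_median_sketch(d, w):
    # Sort values, accumulate their weights, and pick the first value whose
    # cumulative weight reaches half the total weight.
    order = np.argsort(d)
    d_sorted = np.asarray(d)[order]
    w_sorted = np.asarray(w)[order]
    cum = np.cumsum(w_sorted)
    return d_sorted[np.searchsorted(cum, cum[-1] / 2.0)]

d = np.array([5, 4, 3, 1, 2])
w = np.array([10, 22, 9, 2, 5])
med = weighted_median_sketch(d, w)  # 4, matching the doctest above
```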

pysal.esda.smoothing.sum_by_n(d, w, n)
A utility function to summarize a data array into n values after weighting the array with another weight array w
Parameters:  d (array (n, 1)) – data values
 w (array (n, 1)) – weight values
 n (integer) – the number of groups
Returns: (n, 1), an array with summarized values
Return type: array
Examples
Creating an array including four integers. We will compute weighted means for every two elements.
>>> d = np.array([10, 9, 20, 30])
Here is another array with the weight values for d’s elements.
>>> w = np.array([0.5, 0.1, 0.3, 0.8])
We specify the number of groups for which the weighted mean is computed.
>>> n = 2
Applying sum_by_n function
>>> sum_by_n(d, w, n)
array([  5.9,  30. ])
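The computation above can be sketched directly (an illustration, not PySAL's code): weight d elementwise by w, then sum each of n consecutive, equal-length groups.

```python
import numpy as np

def sum_by_n_sketch(d, w, n):
    # Weight d elementwise by w, then sum each of n consecutive groups.
    dw = np.asarray(d, float) * np.asarray(w, float)
    return np.array([g.sum() for g in np.split(dw, n)])

out = sum_by_n_sketch([10, 9, 20, 30], [0.5, 0.1, 0.3, 0.8], 2)
# [5.9, 30.0]: (10*0.5 + 9*0.1) and (20*0.3 + 30*0.8)
```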

pysal.esda.smoothing.crude_age_standardization(e, b, n)
A utility function to compute rate through crude age standardization
Parameters:  e (array (n*h, 1)) – event variable measured for each age group across n spatial units
 b (array (n*h, 1)) – population at risk variable measured for each age group across n spatial units
 n (integer) – the number of spatial units
Notes
e and b are arranged in the same order
Returns: (n, 1), age standardized rate
Return type: array
Examples
Creating an array of an event variable (e.g., the number of cancer patients) for 2 regions in each of which 4 age groups are available. The first 4 values are event values for 4 age groups in the region 1, and the next 4 values are for 4 age groups in the region 2.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Creating another array of a population-at-risk variable (e.g., total population) for the same two regions. The order for entering values is the same as the case of e.
>>> b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
Specifying the number of regions.
>>> n = 2
Applying crude_age_standardization function to e and b
>>> crude_age_standardization(e, b, n)
array([ 0.2375    ,  0.26666667])
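Crude age standardization amounts to pooling events and populations across age groups within each region and taking the pooled rate, which reproduces the values above. A minimal sketch (an illustration, not PySAL's code):

```python
import numpy as np

def crude_standardization_sketch(e, b, n):
    # Sum events and populations over the age groups of each region,
    # then take the pooled rate per region.
    e_n = np.array([g.sum() for g in np.split(np.asarray(e, float), n)])
    b_n = np.array([g.sum() for g in np.split(np.asarray(b, float), n)])
    return e_n / b_n

e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
rates = crude_standardization_sketch(e, b, 2)
# [0.2375, 0.26666667]: 95/400 and 104/390, matching the doctest above
```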

pysal.esda.smoothing.direct_age_standardization(e, b, s, n, alpha=0.05)
A utility function to compute rate through direct age standardization
Parameters:  e (array) – (n*h, 1), event variable measured for each age group across n spatial units
 b (array) – (n*h, 1), population at risk variable measured for each age group across n spatial units
 s (array) – (n*h, 1), standard population for each age group across n spatial units
 n (integer) – the number of spatial units
 alpha (float) – significance level for confidence interval
Notes
e, b, and s are arranged in the same order
Returns: a list of n tuples; each tuple contains an age-standardized rate and the lower and upper limits of its confidence interval
Return type: list
Examples
Creating an array of an event variable (e.g., the number of cancer patients) for 2 regions in each of which 4 age groups are available. The first 4 values are event values for 4 age groups in the region 1, and the next 4 values are for 4 age groups in the region 2.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Creating another array of a population-at-risk variable (e.g., total population) for the same two regions. The order for entering values is the same as the case of e.
>>> b = np.array([1000, 1000, 1100, 900, 1000, 900, 1100, 900])
For direct age standardization, we also need the data for standard population. Standard population is a reference populationatrisk (e.g., population distribution for the U.S.) whose age distribution can be used as a benchmarking point for comparing age distributions across regions (e.g., population distribution for Arizona and California). Another array including standard population is created.
>>> s = np.array([1000, 900, 1000, 900, 1000, 900, 1000, 900])
Specifying the number of regions.
>>> n = 2
Applying direct_age_standardization function to e, b, and s
>>> [i[0] for i in direct_age_standardization(e, b, s, n)]
[0.023744019138755977, 0.026650717703349279]

pysal.esda.smoothing.indirect_age_standardization(e, b, s_e, s_b, n, alpha=0.05)
A utility function to compute rate through indirect age standardization
Parameters:  e (array) – (n*h, 1), event variable measured for each age group across n spatial units
 b (array) – (n*h, 1), population at risk variable measured for each age group across n spatial units
 s_e (array) – (n*h, 1), event variable measured for each age group across n spatial units in a standard population
 s_b (array) – (n*h, 1), population variable measured for each age group across n spatial units in a standard population
 n (integer) – the number of spatial units
 alpha (float) – significance level for confidence interval
Notes
e, b, s_e, and s_b are arranged in the same order
Returns: a list of n tuples; each tuple contains an age-standardized rate and the lower and upper limits of its confidence interval
Return type: list
Examples
Creating an array of an event variable (e.g., the number of cancer patients) for 2 regions in each of which 4 age groups are available. The first 4 values are event values for 4 age groups in the region 1, and the next 4 values are for 4 age groups in the region 2.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Creating another array of a population-at-risk variable (e.g., total population) for the same two regions. The order for entering values is the same as the case of e.
>>> b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
For indirect age standardization, we also need the data for standard population and event. Standard population is a reference population-at-risk (e.g., population distribution for the U.S.) whose age distribution can be used as a benchmarking point for comparing age distributions across regions (e.g., population distribution for Arizona and California). When the same concept is applied to the event variable, we call it standard event (e.g., the number of cancer patients in the U.S.). Two additional arrays including standard population and event are created.
>>> s_e = np.array([100, 45, 120, 100, 50, 30, 200, 80])
>>> s_b = np.array([1000, 900, 1000, 900, 1000, 900, 1000, 900])
Specifying the number of regions.
>>> n = 2
Applying indirect_age_standardization function to e, b, s_e, and s_b
>>> [i[0] for i in indirect_age_standardization(e, b, s_e, s_b, n)]
[0.23723821989528798, 0.2610803324099723]

pysal.esda.smoothing.standardized_mortality_ratio(e, b, s_e, s_b, n)
A utility function to compute standardized mortality ratio (SMR).
Parameters:  e (array) – (n*h, 1), event variable measured for each age group across n spatial units
 b (array) – (n*h, 1), population at risk variable measured for each age group across n spatial units
 s_e (array) – (n*h, 1), event variable measured for each age group across n spatial units in a standard population
 s_b (array) – (n*h, 1), population variable measured for each age group across n spatial units in a standard population
 n (integer) – the number of spatial units
Notes
e, b, s_e, and s_b are arranged in the same order
Returns: (n, 1)
Return type: array
Examples
Creating an array of an event variable (e.g., the number of cancer patients) for 2 regions in each of which 4 age groups are available. The first 4 values are event values for 4 age groups in the region 1, and the next 4 values are for 4 age groups in the region 2.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Creating another array of a population-at-risk variable (e.g., total population) for the same two regions. The order for entering values is the same as the case of e.
>>> b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
To compute standardized mortality ratio (SMR), we need two additional arrays for standard population and event. Creating s_e and s_b for standard event and population, respectively.
>>> s_e = np.array([100, 45, 120, 100, 50, 30, 200, 80])
>>> s_b = np.array([1000, 900, 1000, 900, 1000, 900, 1000, 900])
Specifying the number of regions.
>>> n = 2
Applying the standardized_mortality_ratio function to e, b, s_e, and s_b
>>> standardized_mortality_ratio(e, b, s_e, s_b, n)
array([ 2.48691099,  2.73684211])
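The SMR values above follow from the standard definition: observed events divided by the events expected when the standard population's age-specific rates (s_e/s_b) are applied to each region's at-risk population. A NumPy sketch of that computation (an illustration, not PySAL's code):

```python
import numpy as np

def smr_sketch(e, b, s_e, s_b, n):
    # Expected events: apply the standard age-specific rates s_e/s_b to each
    # region's population b, elementwise over the n*h age-group entries.
    expected = np.asarray(b, float) * (np.asarray(s_e, float) / np.asarray(s_b, float))
    # Sum observed and expected events over each region's age groups.
    obs = np.array([g.sum() for g in np.split(np.asarray(e, float), n)])
    exp_n = np.array([g.sum() for g in np.split(expected, n)])
    return obs / exp_n

e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
s_e = np.array([100, 45, 120, 100, 50, 30, 200, 80])
s_b = np.array([1000, 900, 1000, 900, 1000, 900, 1000, 900])
smr = smr_sketch(e, b, s_e, s_b, 2)
# [2.48691099, 2.73684211]: 95/38.2 and 104/38.0, matching the doctest above
```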

pysal.esda.smoothing.choynowski(e, b, n, threshold=None)
Choynowski map probabilities [Choynowski1959] .
Parameters:  e (array (n*h, 1)) – event variable measured for each age group across n spatial units
 b (array (n*h, 1)) – population at risk variable measured for each age group across n spatial units
 n (integer) – the number of spatial units
 threshold (float) –
Notes
e and b are arranged in the same order
Returns: (n, 1), Choynowski map probabilities
Return type: array
Examples
Creating an array of an event variable (e.g., the number of cancer patients) for 2 regions in each of which 4 age groups are available. The first 4 values are event values for 4 age groups in the region 1, and the next 4 values are for 4 age groups in the region 2.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Creating another array of a population-at-risk variable (e.g., total population) for the same two regions. The order for entering values is the same as the case of e.
>>> b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
Specifying the number of regions.
>>> n = 2
Applying the choynowski function to e and b
>>> print choynowski(e, b, n)
[ 0.30437751  0.29367033]

pysal.esda.smoothing.assuncao_rate(e, b, geoda=True)
Standardized rates used for computing Moran’s I corrected for rate variables.
Parameters:  e (array) – (n, 1), event variable measured at n spatial units
 b (array) – (n, 1), population at risk variable measured at n spatial units
 geoda (boolean) – If True, conform with Geoda implementation: if a<0, variance estimator is v_i = b/x_i for any i; otherwise v_i = a+b/x_i. If False, conform with Assuncao and Reis (1999) [Assuncao1999] : assign v_i = a+b/x_i and check individual v_i: if v_i<0, assign v_i = b/x_i. Default is True.
Notes
The mean and standard deviation used for standardizing rates are those of Empirical Bayes rate estimates. Based on Assuncao and Reis (1999) [Assuncao1999] .
Returns: (n, 1)
Return type: array
Examples
Create an array of an event variable (e.g., the number of cancer patients) for 8 regions.
>>> e = np.array([30, 25, 25, 15, 33, 21, 30, 20])
Create another array of a population-at-risk variable (e.g., total population) for the same 8 regions. The order for entering values is the same as the case of e.
>>> b = np.array([100, 100, 110, 90, 100, 90, 110, 90])
Computing the rates
>>> print(assuncao_rate(e, b)[:4])
[ 0.95839273  0.03783129  0.51460896  1.61105841]
>>> import pysal
>>> w = pysal.open(pysal.examples.get_path("sids2.gal")).read()
>>> f = pysal.open(pysal.examples.get_path("sids2.dbf"))
>>> e = np.array(f.by_col('SID79'))
>>> b = np.array(f.by_col('BIR79'))
>>> print(assuncao_rate(e, b)[:4])
[1.48875691 1.78507268 0.34422806 0.26190802]