# weights.Distance: Distance-based spatial weights

The weights.Distance module provides spatial weights defined on distance relationships.

New in version 1.0.

class pysal.weights.Distance.KNN(data, k=2, p=2, ids=None, radius=None, distance_metric='euclidean')[source]

Creates nearest neighbor weights matrix based on k nearest neighbors.

Parameters:
- kdtree (object) – PySAL KDTree or ArcKDTree where KDtree.data is an (n,k) array of n observations on k characteristics used to measure distances between the n objects
- k (int) – number of nearest neighbors
- p (float) – Minkowski p-norm distance metric parameter, 1<=p<=infinity (2: Euclidean distance; 1: Manhattan distance). Ignored if the KDTree is an ArcKDTree
- ids (list) – identifiers to attach to each observation

Returns: w (W) – instance; Weights object with binary weights

Examples

>>> points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> kd = pysal.cg.kdtree.KDTree(np.array(points))
>>> wnn2 = pysal.KNN(kd, 2)
>>> [1,3] == wnn2.neighbors[0]
True


ids

>>> wnn2 = KNN(kd,2)
>>> wnn2[0]
{1: 1.0, 3: 1.0}
>>> wnn2[1]
{0: 1.0, 3: 1.0}


now with 1 rather than 0 offset

>>> wnn2 = KNN(kd, 2, ids=range(1,7))
>>> wnn2[1]
{2: 1.0, 4: 1.0}
>>> wnn2[2]
{1: 1.0, 4: 1.0}
>>> 0 in wnn2.neighbors
False
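
The effect of `ids` in the example above can be pictured as a relabeling of positional indices. The following is a plain-Python sketch, not PySAL code: `relabel` is a hypothetical helper, and the positional neighbor lists are copied from the 0-offset output shown earlier.

```python
def relabel(neighbors_by_position, ids):
    """Map positional neighbor lists onto user-supplied ids (hypothetical helper)."""
    ids = list(ids)
    return {
        ids[i]: {ids[j]: 1.0 for j in nbrs}
        for i, nbrs in neighbors_by_position.items()
    }

# Positional neighbors of the first two points, taken from the 0-offset example.
by_position = {0: [1, 3], 1: [0, 3]}
w = relabel(by_position, range(1, 7))
print(w[1])    # {2: 1.0, 4: 1.0}
print(w[2])    # {1: 1.0, 4: 1.0}
print(0 in w)  # False
```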


Notes

Ties between neighbors of equal distance are arbitrarily broken.
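
To make the construction concrete, here is a minimal brute-force sketch of binary k-nearest-neighbor weights in plain Python. It does not use PySAL or a KD-tree; `minkowski` and `knn_weights` are illustrative names, and ties are broken by the lower index here, which is a choice of this sketch only (the library makes no such guarantee).

```python
import math

def minkowski(a, b, p=2):
    """Minkowski p-norm distance; p=math.inf gives the Chebyshev distance."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

def knn_weights(points, k=2, p=2):
    """Binary k-nearest-neighbor weights keyed by position (illustrative only)."""
    weights = {}
    for i, a in enumerate(points):
        # Sorting (distance, index) pairs breaks ties by the lower index.
        ranked = sorted(
            (minkowski(a, b, p), j) for j, b in enumerate(points) if j != i
        )
        nearest = sorted(j for _, j in ranked[:k])
        weights[i] = {j: 1.0 for j in nearest}
    return weights

points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
wnn2 = knn_weights(points, k=2)
print(wnn2[0])  # {1: 1.0, 3: 1.0}
print(wnn2[1])  # {0: 1.0, 3: 1.0}
```

The two printed rows agree with the `wnn2[0]` and `wnn2[1]` outputs in the examples above.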

See also: pysal.weights.W

classmethod from_array(array, **kwargs)[source]

Creates nearest neighbor weights matrix based on k nearest neighbors.

Parameters:
- array (np.ndarray) – (n, k) array representing n observations on k characteristics used to measure distances between the n objects
- **kwargs – keyword arguments, see Rook

Returns: w (W) – instance; Weights object with binary weights

Examples

>>> points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> wnn2 = pysal.KNN.from_array(points, 2)
>>> [1,3] == wnn2.neighbors[0]
True


ids

>>> wnn2 = KNN.from_array(points,2)
>>> wnn2[0]
{1: 1.0, 3: 1.0}
>>> wnn2[1]
{0: 1.0, 3: 1.0}


now with 1 rather than 0 offset

>>> wnn2 = KNN.from_array(points, 2, ids=range(1,7))
>>> wnn2[1]
{2: 1.0, 4: 1.0}
>>> wnn2[2]
{1: 1.0, 4: 1.0}
>>> 0 in wnn2.neighbors
False


Notes

Ties between neighbors of equal distance are arbitrarily broken.

See also: pysal.weights.KNN, pysal.weights.W

classmethod from_dataframe(df, geom_col='geometry', ids=None, **kwargs)[source]

Make KNN weights from a dataframe.

Parameters:
- df (pandas.dataframe) – a dataframe with a geometry column that can be used to construct a W object
- geom_col (string) – column name of the geometry stored in df
- ids (string or iterable) – if string, the column name of the indices from the dataframe; if iterable, a list of ids to use for the W; if None, df.index is used

See also: pysal.weights.KNN, pysal.weights.W

classmethod from_shapefile(filepath, **kwargs)[source]

Nearest neighbor weights from a shapefile.

Parameters:
- data (string) – shapefile containing attribute data
- k (int) – number of nearest neighbors
- p (float) – Minkowski p-norm distance metric parameter, 1<=p<=infinity (2: Euclidean distance; 1: Manhattan distance)
- ids (list) – identifiers to attach to each observation
- radius (float) – if supplied, arc distances will be calculated based on the given radius, and p will be ignored

Returns: w (KNN) – instance; Weights object with binary weights
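
As an illustration of what a radius-based arc distance can look like, the sketch below computes a great-circle distance with the haversine formula. This is an assumption about the kind of calculation involved, not PySAL's own code: `arc_distance` is a hypothetical name, points are assumed to be (longitude, latitude) pairs in degrees, and the default of 6371.0 is a commonly used mean Earth radius in kilometres chosen only for the example.

```python
import math

def arc_distance(pt0, pt1, radius=6371.0):
    """Great-circle distance between two (lon, lat) points given in degrees."""
    lon0, lat0 = map(math.radians, pt0)
    lon1, lat1 = map(math.radians, pt1)
    # Haversine formula for the central angle between the two points.
    h = (math.sin((lat1 - lat0) / 2) ** 2
         + math.cos(lat0) * math.cos(lat1) * math.sin((lon1 - lon0) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))

# A quarter of a great circle on a unit sphere has length pi / 2.
quarter = arc_distance((0, 0), (90, 0), radius=1.0)
```

Because the result scales linearly with `radius`, the unit of the returned distance is whatever unit the radius is given in.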

Examples

Polygon shapefile

>>> wc=knnW_from_shapefile(pysal.examples.get_path("columbus.shp"))
>>> "%.4f"%wc.pct_nonzero
'4.0816'
>>> set([2,1]) == set(wc.neighbors[0])
True
>>> wc3=pysal.knnW_from_shapefile(pysal.examples.get_path("columbus.shp"),k=3)
>>> set(wc3.neighbors[0]) == set([2,1,3])
True
>>> set(wc3.neighbors[2]) == set([4,3,0])
True
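
The `pct_nonzero` value printed above follows from simple counting when every observation has exactly k neighbors. The sketch assumes `pct_nonzero` is the percentage of nonzero cells in the full n-by-n matrix; `knn_pct_nonzero` is a hypothetical helper, and n = 49 is an assumption that is consistent with the printed '4.0816'.

```python
def knn_pct_nonzero(n, k):
    """Percentage of nonzero cells in an n-by-n matrix with k neighbors per row."""
    return 100.0 * (n * k) / (n * n)

# n = 49 is an assumption inferred from the printed value for k = 2.
print("%.4f" % knn_pct_nonzero(49, 2))  # 4.0816
```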


1 offset rather than 0 offset

>>> wc3_1=knnW_from_shapefile(pysal.examples.get_path("columbus.shp"),k=3,idVariable="POLYID")
>>> set([4,3,2]) == set(wc3_1.neighbors[1])
True
>>> wc3_1.weights[2]
[1.0, 1.0, 1.0]
>>> set([4,1,8]) == set(wc3_1.neighbors[2])
True


Point shapefile

>>> w=knnW_from_shapefile(pysal.examples.get_path("juvenile.shp"))
>>> w.pct_nonzero
1.1904761904761905
>>> w1=knnW_from_shapefile(pysal.examples.get_path("juvenile.shp"),k=1)
>>> "%.3f"%w1.pct_nonzero
'0.595'

Notes

Ties between neighbors of equal distance are arbitrarily broken.

See also: pysal.weights.KNN, pysal.weights.W

reweight(k=None, p=None, new_data=None, new_ids=None, inplace=True)[source]

Redo K-Nearest Neighbor weights construction using the given parameters.

Parameters:
- new_data (np.ndarray) – an array containing additional data to use in the KNN weight
- new_ids (list) – a list aligned with new_data that provides the ids for each new observation
- inplace (bool) – a flag denoting whether to modify the KNN object in place or to return a new KNN object
- k (int) – number of nearest neighbors
- p (float) – Minkowski p-norm distance metric parameter, 1<=p<=infinity (2: Euclidean distance; 1: Manhattan distance). Ignored if the KDTree is an ArcKDTree

Returns: a copy of the object using the new parameterization, or None if the object is reweighted in place.

class pysal.weights.Distance.Kernel(data, bandwidth=None, fixed=True, k=2, function='triangular', eps=1.0000001, ids=None, diagonal=False)[source]

Spatial weights based on kernel functions.

Parameters:
- data (array) – (n,k) array or KDTree where KDtree.data is an (n,k) array of n observations on k characteristics used to measure distances between the n objects
- bandwidth (float or array-like, optional) – the bandwidth $h_i$ for the kernel
- fixed (binary) – If true then $h_i = bandwidth \ \forall i$. If false then bandwidth is adaptive across observations.
- k (int) – the number of nearest neighbors to use for determining bandwidth. For fixed bandwidth, $h_i = \max(dknn) \ \forall i$, where $dknn$ is a vector of k-nearest neighbor distances (the distance to the kth nearest neighbor for each observation). For adaptive bandwidths, $h_i = dknn_i$.
- diagonal (boolean) – If true, set diagonal weights = 1.0; if false (default), diagonal weights are set to the value given by the kernel function.
- function ({'triangular','uniform','quadratic','quartic','gaussian'}) – kernel function defined as follows with $z_{i,j} = d_{i,j}/h_i$:
  - triangular: $K(z) = (1 - |z|)$ if $|z| \le 1$
  - uniform: $K(z) = 1/2$ if $|z| \le 1$
  - quadratic: $K(z) = (3/4)(1 - z^2)$ if $|z| \le 1$
  - quartic: $K(z) = (15/16)(1 - z^2)^2$ if $|z| \le 1$
  - gaussian: $K(z) = (2\pi)^{-1/2} \exp(-z^2/2)$
- eps (float) – adjustment to ensure knn distance range is closed on the knnth observations
weights

dict – Dictionary keyed by id with a list of weights for each neighbor

neighbors

dict – of lists of neighbors keyed by observation id

bandwidth

array – array of bandwidths
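
The kernel functions named by the `function` parameter can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: it uses the standard forms of these kernels evaluated at z = d / h, treats the bounded kernels as zero outside the bandwidth, and `kernel_weight` is a hypothetical name.

```python
import math

def kernel_weight(d, h, function="triangular"):
    """Kernel weight for distance d and bandwidth h, using z = d / h."""
    z = d / h
    if function == "gaussian":
        return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    if abs(z) > 1:
        return 0.0  # the bounded kernels vanish outside the bandwidth
    if function == "triangular":
        return 1.0 - abs(z)
    if function == "uniform":
        return 0.5
    if function == "quadratic":
        return 0.75 * (1.0 - z * z)
    if function == "quartic":
        return (15.0 / 16.0) * (1.0 - z * z) ** 2
    raise ValueError("unknown kernel function: %r" % function)

h = 20.000002  # the fixed bandwidth printed in the first example below
w_triangular = kernel_weight(10.0, h)           # about 0.50000005
w_gaussian = kernel_weight(0.0, h, "gaussian")  # about 0.39894228
```

With a distance of 10 and the bandwidth of 20.000002, the triangular value agrees with the second entry of `kw.weights[0]` in the examples, and the Gaussian value at zero distance agrees with the diagonal entries of `kq.weights`.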

Examples

>>> points=[(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> kw=Kernel(points)
>>> kw.weights[0]
[1.0, 0.500000049999995, 0.4409830615267465]
>>> kw.neighbors[0]
[0, 1, 3]
>>> kw.bandwidth
array([[ 20.000002],
[ 20.000002],
[ 20.000002],
[ 20.000002],
[ 20.000002],
[ 20.000002]])
>>> kw15=Kernel(points,bandwidth=15.0)
>>> kw15[0]
{0: 1.0, 1: 0.33333333333333337, 3: 0.2546440075000701}
>>> kw15.neighbors[0]
[0, 1, 3]
>>> kw15.bandwidth
array([[ 15.],
[ 15.],
[ 15.],
[ 15.],
[ 15.],
[ 15.]])


>>> bw=[25.0,15.0,25.0,16.0,14.5,25.0]
>>> kwa=Kernel(points,bandwidth=bw)
>>> kwa.weights[0]
[1.0, 0.6, 0.552786404500042, 0.10557280900008403]
>>> kwa.neighbors[0]
[0, 1, 3, 4]
>>> kwa.bandwidth
array([[ 25. ],
[ 15. ],
[ 25. ],
[ 16. ],
[ 14.5],
[ 25. ]])


>>> kwea=Kernel(points,fixed=False)
>>> kwea.weights[0]
[1.0, 0.10557289844279438, 9.99999900663795e-08]
>>> kwea.neighbors[0]
[0, 1, 3]
>>> kwea.bandwidth
array([[ 11.18034101],
[ 11.18034101],
[ 20.000002  ],
[ 11.18034101],
[ 14.14213704],
[ 18.02775818]])
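
The bandwidth arrays printed above can be approximated with a brute-force calculation. The sketch assumes that the distance to the k-th nearest neighbor is multiplied by `eps`, which is consistent with the printed values but is an inference, not something the text states; `knn_distances` and `bandwidths` are hypothetical names.

```python
import math

def knn_distances(points, k=2):
    """Distance from each 2-D point to its k-th nearest other point (Euclidean)."""
    out = []
    for i, a in enumerate(points):
        d = sorted(
            math.hypot(a[0] - b[0], a[1] - b[1])
            for j, b in enumerate(points) if j != i
        )
        out.append(d[k - 1])
    return out

def bandwidths(points, k=2, fixed=True, eps=1.0000001):
    """Sketch: fixed uses max(dknn) everywhere, adaptive uses dknn_i per point."""
    dknn = knn_distances(points, k)
    if fixed:
        return [max(dknn) * eps] * len(points)
    return [d * eps for d in dknn]

points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
fixed_bw = bandwidths(points)
adaptive_bw = bandwidths(points, fixed=False)
```

Rounded to the printed precision, `fixed_bw` matches the `kw.bandwidth` output and `adaptive_bw` matches the `kwea.bandwidth` output.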


Endogenous adaptive bandwidths with Gaussian kernel

>>> kweag=Kernel(points,fixed=False,function='gaussian')
>>> kweag.weights[0]
[0.3989422804014327, 0.2674190291577696, 0.2419707487162134]
>>> kweag.bandwidth
array([[ 11.18034101],
[ 11.18034101],
[ 20.000002  ],
[ 11.18034101],
[ 14.14213704],
[ 18.02775818]])


Diagonals to 1.0

>>> kq = Kernel(points,function='gaussian')
>>> kq.weights
{0: [0.3989422804014327, 0.35206533556593145, 0.3412334260702758], 1: [0.35206533556593145, 0.3989422804014327, 0.2419707487162134, 0.3412334260702758, 0.31069657591175387], 2: [0.2419707487162134, 0.3989422804014327, 0.31069657591175387], 3: [0.3412334260702758, 0.3412334260702758, 0.3989422804014327, 0.3011374490937829, 0.26575287272131043], 4: [0.31069657591175387, 0.31069657591175387, 0.3011374490937829, 0.3989422804014327, 0.35206533556593145], 5: [0.26575287272131043, 0.35206533556593145, 0.3989422804014327]}
>>> kqd = Kernel(points, function='gaussian', diagonal=True)
>>> kqd.weights
{0: [1.0, 0.35206533556593145, 0.3412334260702758], 1: [0.35206533556593145, 1.0, 0.2419707487162134, 0.3412334260702758, 0.31069657591175387], 2: [0.2419707487162134, 1.0, 0.31069657591175387], 3: [0.3412334260702758, 0.3412334260702758, 1.0, 0.3011374490937829, 0.26575287272131043], 4: [0.31069657591175387, 0.31069657591175387, 0.3011374490937829, 1.0, 0.35206533556593145], 5: [0.26575287272131043, 0.35206533556593145, 1.0]}

classmethod from_array(array, **kwargs)[source]

Construct Kernel weights from an array. Supports all the same options as pysal.weights.Kernel.

See also: pysal.weights.Kernel, pysal.weights.W

classmethod from_dataframe(df, geom_col='geometry', ids=None, **kwargs)[source]

Make Kernel weights from a dataframe.

Parameters:
- df (pandas.dataframe) – a dataframe with a geometry column that can be used to construct a W object
- geom_col (string) – column name of the geometry stored in df
- ids (string or iterable) – if string, the column name of the indices from the dataframe; if iterable, a list of ids to use for the W; if None, df.index is used

See also: pysal.weights.Kernel, pysal.weights.W

classmethod from_shapefile(filepath, idVariable=None, **kwargs)[source]

Kernel based weights from shapefile

Parameters:
- shapefile (string) – shapefile name with shp suffix
- idVariable (string) – name of column in shapefile’s DBF to use for ids

Returns: Kernel Weights Object

See also: pysal.weights.Kernel, pysal.weights.W

class pysal.weights.Distance.DistanceBand(data, threshold, p=2, alpha=-1.0, binary=True, ids=None, build_sp=True, silent=False)[source]

Spatial weights based on distance band.

Parameters:
- data (array) – (n,k) array or KDTree where KDtree.data is an (n,k) array of n observations on k characteristics used to measure distances between the n objects
- threshold (float) – distance band
- p (float) – Minkowski p-norm distance metric parameter, 1<=p<=infinity (2: Euclidean distance; 1: Manhattan distance)
- binary (boolean) – If true, $w_{ij}=1$ if $d_{ij} \le threshold$, otherwise $w_{ij}=0$. If false, $w_{ij}=d_{ij}^{\alpha}$
- alpha (float) – distance decay parameter for weight (default -1.0). If alpha is positive the weights will not decline with distance. If binary is True, alpha is ignored
- ids (list) – values to use for keys of the neighbors and weights dicts
- build_sp (boolean) – True to build a sparse distance matrix and False to build a dense distance matrix; significant speed gains may be obtained depending on the sparsity of the distance matrix and the threshold that is applied
- silent (boolean) – By default PySAL will print a warning if the dataset contains any disconnected observations or islands. To silence this warning set this parameter to True.
weights

dict – of neighbor weights keyed by observation id

neighbors

dict – of neighbors keyed by observation id

Examples

>>> points=[(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> wcheck = pysal.W({0: [1, 3], 1: [0, 3], 2: [], 3: [0, 1], 4: [5], 5: [4]})
WARNING: there is one disconnected observation (no neighbors)
Island id:  [2]
>>> w=DistanceBand(points,threshold=11.2)
WARNING: there is one disconnected observation (no neighbors)
Island id:  [2]
>>> pysal.weights.util.neighbor_equality(w, wcheck)
True
>>> w=DistanceBand(points,threshold=14.2)
>>> wcheck = pysal.W({0: [1, 3], 1: [0, 3, 4], 2: [4], 3: [1, 0], 4: [5, 2, 1], 5: [4]})
>>> pysal.weights.util.neighbor_equality(w, wcheck)
True


inverse distance weights

>>> w=DistanceBand(points,threshold=11.2,binary=False)
WARNING: there is one disconnected observation (no neighbors)
Island id:  [2]
>>> w.weights[0]
[0.10000000000000001, 0.089442719099991588]
>>> w.neighbors[0]
[1, 3]
>>>


gravity weights

>>> w=DistanceBand(points,threshold=11.2,binary=False,alpha=-2.)
WARNING: there is one disconnected observation (no neighbors)
Island id:  [2]
>>> w.weights[0]
[0.01, 0.0079999999999999984]
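
The binary, inverse distance, and gravity examples above can be reproduced with a brute-force sketch in plain Python. It is an illustration only: `distance_band` is a hypothetical function that uses Euclidean distance on 2-D points, with no KD-tree or sparse matrix involved.

```python
import math

def distance_band(points, threshold, binary=True, alpha=-1.0):
    """Illustrative distance-band weights: returns (neighbors, weights) dicts."""
    neighbors, weights = {}, {}
    for i, a in enumerate(points):
        neighbors[i], weights[i] = [], []
        for j, b in enumerate(points):
            if i == j:
                continue
            d = math.hypot(a[0] - b[0], a[1] - b[1])
            if d <= threshold:
                neighbors[i].append(j)
                # Binary weights are 1.0; otherwise apply the distance decay.
                weights[i].append(1.0 if binary else d ** alpha)
    return neighbors, weights

points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
neighbors, inverse = distance_band(points, 11.2, binary=False)
_, gravity = distance_band(points, 11.2, binary=False, alpha=-2.0)
# Observations with an empty neighbor list are the islands the warning reports.
islands = [i for i, nbrs in neighbors.items() if not nbrs]
print(neighbors[0])  # [1, 3]
print(islands)       # [2]
```

Here `inverse[0]` is approximately `[0.1, 0.0894]` and `gravity[0]` is approximately `[0.01, 0.008]`, in line with the `w.weights[0]` outputs shown above.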


Notes

This was initially implemented running scipy 0.8.0dev (in EPD 6.1). Earlier versions of scipy (0.7.0) have a logic bug in scipy/sparse/dok.py, so Serge changed line 221 of that file on sal-dev to fix it.

classmethod from_array(array, threshold, **kwargs)[source]

Construct DistanceBand weights from an array. Supports all the same options as pysal.weights.DistanceBand.

See also: pysal.weights.DistanceBand, pysal.weights.W

classmethod from_dataframe(df, threshold, geom_col='geometry', ids=None, **kwargs)[source]

Make DistanceBand weights from a dataframe.

Parameters:
- df (pandas.dataframe) – a dataframe with a geometry column that can be used to construct a W object
- geom_col (string) – column name of the geometry stored in df
- ids (string or iterable) – if string, the column name of the indices from the dataframe; if iterable, a list of ids to use for the W; if None, df.index is used

See also: pysal.weights.DistanceBand, pysal.weights.W

classmethod from_shapefile(filepath, threshold, idVariable=None, **kwargs)[source]