DAY 29
1
AI & Data

## Day29 Pysal空間自相關

Tobler地理學第一定律：all attribute values on a geographic surface are related to each other, but closer values are more strongly related than are more distant ones.

`空間自相關`指的是空間單元的數據與其周遭空間單元的相似性，

• spatial weight與attribute weight
• 全域空間自相關
• 區域空間自相關

## spatial weight與attribute weight

``````import geopandas as gpd
light=light[light.is_valid]
light=light[light['district']=='永和區']
light.crs = {'init' :'epsg:3826'}
light=light.reset_index()
village=village[village.is_valid]
village.crs = {'init' :'epsg:3826'}
village=village.reset_index()
result['count']=1
result['avg']=result['count']/result.area
result
``````

``````cl = pysal.Quantiles(result['count'], k=5)
result.assign(cl=cl.yb).plot('cl', legend=True, categorical=True)
``````

``````wq =  libpysal.weights.Queen.from_dataframe(result)
wq.transform = 'r' #  row standardization
``````

``````y = result['count']
ylag = libpysal.weights.lag_spatial(wq, y)
``````

``````cl = pysal.Quantiles(ylag, k=5)
result.assign(cl=cl.yb).plot('cl', legend=True, cmap='GnBu', categorical=True)
``````

`spatial lag`的結果，我們以散布圖與原本的值比較(Moran Scatterplot)

``````import numpy as np
import matplotlib.pyplot as plt
y = result['count']
b, a = np.polyfit(y, ylag, 1)
f, ax = plt.subplots(1, figsize=(9, 9))

plt.plot(y, ylag, '.', color='firebrick')

plt.vlines(y.mean(), ylag.min(), ylag.max(), linestyle='--')
plt.hlines(ylag.mean(), y.min(), y.max(), linestyle='--')

plt.plot(y, a + b*y, 'r')
plt.title('Moran Scatterplot')
plt.ylabel('Spatial Lag of count')
plt.xlabel('count')
plt.show()
``````

### 全域空間自相關

Moran's I的公式說明詳見說明

``````import esda
mi = esda.moran.Moran(y, wq)
mi.I
``````

### 區域空間自相關

(有偏袒，沒有特別偏袒)

``````li = esda.moran.Moran_Local(y, wq)
li.q
``````

``````result.assign(cl=li.q).plot('cl', legend=True, cmap='GnBu', categorical=True)
``````

`Local Moran’I`會算出1-4個等級，

``````sig = li.p_sim < 0.05
hotspot = 1 * (sig * li.q==1)
coldspot = 3 * (sig * li.q==3)
doughnut = 2 * (sig * li.q==2)
diamond = 4 * (sig * li.q==4)
spots = hotspot + coldspot + doughnut + diamond

spot_labels = [ '0 ns', '1 hot spot', '2 doughnut', '3 cold spot', '4 diamond']
labels = [spot_labels[i] for i in spots]
result.assign(cl=labels).plot('cl', legend=True, cmap='GnBu', categorical=True)
``````

hotspot：自相關高，正相關，通過設定的統計測試

``````spots = ['n.sig.', 'hotspot']
labels = [spots[i] for i in hotspot*1]
result.assign(cl=labels).plot('cl', legend=True, cmap='GnBu', categorical=True)
``````

``````spots = ['n.sig.', 'diamond']
labels = [spots[i] for i in diamond*1]
result.assign(cl=labels).plot('cl', legend=True, cmap='GnBu', categorical=True)
``````

## 參考資料

Global Moran's I| ArcGIS for Desktop
ESDA with PySAL
GitHub - pysal/esda: statistics and classes for exploratory spatial data analysis