DAY 3
AI & Data

AI & Data_3 (Mathematical Foundations of Machine Learning)

• Rookie stage: Agumon, Biyomon
• Champion stage: Devimon, Ogremon

• Minkowski Distance
• Euclidean Distance
• Manhattan Distance
• Chebyshev Distance
• Cosine
• Hamming Distance
• Jaccard Similarity Coefficient
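
Minkowski distance, the first entry above, generalizes several of the others: for an order p it is (sum of |x1i - x2i|^p)^(1/p), which reduces to Manhattan distance at p = 1, Euclidean distance at p = 2, and Chebyshev distance in the limit as p goes to infinity. The following is a minimal sketch using the same example vectors as the rest of this post; the helper name `minkowski` is an illustrative assumption, not something from the original code.

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski distance of order p between two 1-D vectors (illustrative helper)."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    if np.isinf(p):
        # Limit case: the largest single-coordinate difference
        return diff.max()
    return np.sum(diff ** p) ** (1.0 / p)

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

print(minkowski(v1, v2, 1))       # Manhattan: 9.0
print(minkowski(v1, v2, 2))       # Euclidean: sqrt(27)
print(minkowski(v1, v2, np.inf))  # Chebyshev: 3.0
```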

Euclidean Distance

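```python
import numpy as np

# Plain arrays are used here; the np.matrix class is no longer recommended
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Square root of the sum of squared coordinate differences
print(np.sqrt(np.sum((v1 - v2) ** 2)))
```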

Manhattan Distance

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Sum of the absolute coordinate differences
print(np.sum(np.abs(v1 - v2)))
```

Chebyshev Distance

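Chebyshev distance is the number of moves a chess king needs to get from one square (x1, y1) to another square (x2, y2). Because the king can move one square in any direction, a diagonal move reduces both coordinate differences at once, so the number of moves is max(|x2-x1|, |y2-y1|), that is, d = max(|x2-x1|, |y2-y1|). For two n-dimensional vectors (x11, x12, ..., x1n) and (x21, x22, ..., x2n), d = max(|x1i-x2i|), with i running from 1 to n.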

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Largest absolute coordinate difference
print(np.abs(v1 - v2).max())
```
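
The chess-king description can be checked directly by walking the king one square at a time and counting moves. This is a small sketch under assumed inputs: the function name `king_moves` and the start and end squares are hypothetical, chosen only for illustration.

```python
def king_moves(start, end):
    """Count king moves by stepping one square toward the target each turn."""
    x, y = start
    steps = 0
    while (x, y) != end:
        # Move at most one square along each axis; a diagonal step moves both
        x += (end[0] > x) - (end[0] < x)
        y += (end[1] > y) - (end[1] < y)
        steps += 1
    return steps

start, end = (1, 1), (4, 8)
chebyshev = max(abs(end[0] - start[0]), abs(end[1] - start[1]))
print(king_moves(start, end), chebyshev)  # 7 7
```

The simulated move count matches the max formula because every move shrinks the larger coordinate difference by exactly one.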

Cosine

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Dot product divided by the product of the two vector lengths
cosv = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
print(cosv)
```
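
Cosine measures only the angle between two vectors, so rescaling a vector does not change the result. A brief sketch of that property, assuming a hypothetical helper named `cosine_similarity`:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (illustrative helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v1 = np.array([1, 2, 3])

print(cosine_similarity(v1, 10 * v1))     # same direction: approximately 1
print(cosine_similarity([1, 0], [0, 1]))  # orthogonal: 0.0
print(cosine_similarity(v1, -v1))         # opposite direction: approximately -1
```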

Hamming Distance

```python
import numpy as np

v = np.array([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]])
# Count the positions at which the two binary vectors differ
print(np.count_nonzero(v[0] != v[1]))
```
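
Hamming distance applies to any two sequences of equal length, not only to binary vectors. The sketch below assumes a hypothetical helper named `hamming` and applies it to strings; the first pair encodes the same bits as the two vectors above.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))

print(hamming("110101001", "011000111"))  # 6
print(hamming("karolin", "kathrin"))      # 3
```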

Jaccard Similarity Coefficient

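```python
import numpy as np
import scipy.spatial.distance as dist

# Boolean rows: each position marks whether an element belongs to the set
v = np.array([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]], dtype=bool)
# pdist's 'jaccard' metric returns the Jaccard distance,
# i.e. 1 minus the Jaccard similarity coefficient
print(dist.pdist(v, 'jaccard'))
```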
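
SciPy's 'jaccard' metric is a distance (a dissimilarity), whereas the heading refers to the similarity coefficient, defined as the size of the intersection divided by the size of the union. The two sum to 1. A minimal sketch that computes the coefficient directly for the same two vectors; the variable names are illustrative.

```python
import numpy as np

v = np.array([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]], dtype=bool)

# Positions set in both vectors, and positions set in at least one of them
intersection = np.count_nonzero(v[0] & v[1])
union = np.count_nonzero(v[0] | v[1])

similarity = intersection / union
print(similarity)      # 0.25 (Jaccard similarity coefficient)
print(1 - similarity)  # 0.75 (Jaccard distance)
```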