## Finding outliers with a box plot in Python

iT邦 Novice Level 3 ‧ 2020-11-03 20:26:48

Tukey's fences:

- Wikipedia, "Outlier"
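For reference, the standard statement of the rule, written with the usual multiplier $k = 1.5$ (the symbols $Q_1$, $Q_3$, and $k$ are notation chosen here, not quoted from the post): a value $x$ is treated as an outlier when it falls outside the interval

$$
\bigl[\, Q_1 - k\,(Q_3 - Q_1),\; Q_3 + k\,(Q_3 - Q_1) \,\bigr], \qquad k = 1.5 .
$$

As a made-up worked case, $Q_1 = 2$ and $Q_3 = 7$ give an interquartile range of $5$, so the fences sit at $2 - 7.5 = -5.5$ and $7 + 7.5 = 14.5$.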

Addendum: I bet you'll say I'm all talk and no code, right? Hmph! Here's the code:

``````
import random

import numpy as np
from matplotlib import pyplot as plt

# In a Jupyter notebook, uncomment the next line to render plots inline.
# %matplotlib inline

# Generate a random list of 100 integers in [0, 10)
array = list(np.random.randint(10, size=100))

# Find Q1 and Q3
q1, q3 = np.percentile(array, [25, 75])
print(f"Q1 is: {q1}, Q3 is: {q3}\n")

# Upper and lower bounds (Tukey's fences)
iqr = q3 - q1
above = q3 + 1.5 * iqr
below = q1 - 1.5 * iqr
print(f"Above is: {above}, Below is: {below}\n")

# Insert a value that may be an outlier (it can also land inside the bounds)
outlier = random.randint(-10, 20)
index = random.randint(0, 100)
array.insert(index, outlier)
print(f"Outlier is: {outlier}, Index is: {index}\n")

# Filter: keep only the values inside the bounds
array = [x for x in array if below <= x <= above]

print(f"After filter:\n{array}")

# Plot
plt.boxplot(array)
plt.show()
``````
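The snippet above computes the bounds before the extra value is inserted and drops whatever falls outside them without reporting it. As a minimal sketch of one alternative, assuming you also want to see which values were removed, the same rule can be written with a NumPy boolean mask. The seed, the two appended values, and the names `values`, `kept`, and `removed` are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(seed=0)        # arbitrary seed, only for repeatability
values = rng.integers(0, 10, size=100)     # integers in [0, 10), like the list above
values = np.append(values, [-10, 20])      # two values that will typically land outside

# Compute the fences on the data that is actually being filtered
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# True where a value lies inside the fences
mask = (values >= lower) & (values <= upper)
kept = values[mask]
removed = values[~mask]

print(f"Fences: [{lower}, {upper}]")
print(f"Removed {removed.size} value(s): {removed}")
```

Keeping `removed` around makes it easy to check whether the dropped points are real anomalies or valid extreme readings before discarding them.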

sky800219 iT邦 Novice Level 5 ‧ 2020-11-03 21:45:07

R (range) = Xmax (maximum) - Xmin (minimum)
The 200-something value is the maximum of your dataset.
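A small sketch of that formula, using made-up numbers with one value above 200 to stand in for the dataset being discussed (the actual data is not shown in this thread):

```python
import numpy as np

# Hypothetical data: most values are small, one is above 200
data = np.array([3, 7, 12, 18, 230])

x_max = data.max()
x_min = data.min()
r = x_max - x_min          # R = Xmax - Xmin

print(f"Xmax = {x_max}, Xmin = {x_min}, R = {r}")
print(np.ptp(data) == r)   # np.ptp ("peak to peak") computes the same range
```

A single very large value dominates the range, which is why the plot's axis stretches to accommodate it.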

sky800219 iT邦 Novice Level 5 ‧ 2020-11-03 23:08:20

``````
plt.boxplot(array, showfliers=False, autorange=True)
``````

sky800219 iT邦 Novice Level 5 ‧ 2020-11-04 17:35:14

The 20 is probably related to that thick black line. I remember seeing it when I did this before, but I've forgotten what it is.

``````
plt.boxplot(array, showfliers=False, autorange=True)
``````

``````
plt.ylim(ymin=-1, ymax=20)
plt.boxplot(array, showfliers=False)
``````
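Note that `showfliers=False` only hides the outlier markers; the data itself is unchanged. As a rough sketch for checking what the box plot treats as whiskers and fliers without drawing anything, `matplotlib.cbook.boxplot_stats` returns the statistics that `boxplot` draws. The data below is made up for illustration.

```python
import numpy as np
from matplotlib import cbook

# Hypothetical data with one value far above the rest
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 230])

# boxplot_stats returns one dict per dataset; default whiskers use 1.5 * IQR
stats = cbook.boxplot_stats(data)[0]

print("Q1:", stats["q1"], "median:", stats["med"], "Q3:", stats["q3"])
print("Whiskers:", stats["whislo"], "to", stats["whishi"])
print("Fliers:", stats["fliers"])
```

The whiskers end at the most extreme data points that still lie inside the fences, so they can be shorter than the fences themselves.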
sky800219 iT邦 Novice Level 5 ‧ 2020-11-04 22:55:25