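As usual, we'll do this with scikit-learn (link).
scikit-learn ships several naive Bayes classifiers, with GaussianNB, MultinomialNB, and BernoulliNB being the three classic ones. Today I'll introduce the first of them, the Gaussian model~ "Gaussian" here means that the likelihood of each feature given the class is assumed to follow a Gaussian distribution (the normal distribution we all know).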
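To make the Gaussian assumption a bit more concrete, here is a rough from-scratch sketch using only NumPy. It is not how scikit-learn implements GaussianNB internally; the function names (`fit_gaussian_nb`, `predict_gaussian_nb`), the toy data, and the small `1e-9` variance floor are all assumptions made up for illustration. The idea is simply: estimate a mean and variance per class and per feature, then pick the class with the highest log prior plus summed Gaussian log-likelihood.

```python
import numpy as np


def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # The 1e-9 floor is an assumption of this sketch; it only avoids division by zero.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances


def predict_gaussian_nb(X, classes, priors, means, variances):
    """Pick the class maximizing log prior + sum of per-feature Gaussian log-likelihoods."""
    # diff has shape (n_samples, n_classes, n_features)
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * (
        np.log(2 * np.pi * variances)[None, :, :] + diff ** 2 / variances[None, :, :]
    ).sum(axis=2)
    log_post = np.log(priors)[None, :] + log_lik
    return classes[np.argmax(log_post, axis=1)]


# Hypothetical toy data: two well-separated clusters with two features each.
X_toy = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
                  [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

params = fit_gaussian_nb(X_toy, y_toy)
pred = predict_gaussian_nb(np.array([[1.1, 2.1], [5.1, 5.9]]), *params)
print(pred)
```

The "naive" part is the sum over features: each feature contributes its own independent Gaussian term, with no covariance between features.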
# -*- coding: utf-8 -*-
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()  # load the iris dataset
features = iris.feature_names
X = iris.data
y = iris.target
# hold out 30% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
gnb = GaussianNB()
y_result = gnb.fit(X_train, y_train).predict(X_test)
# print the accuracy so it also shows up when run as a script
print("Accuracy: %.3f" % accuracy_score(y_test, y_result))
# errors are counted on the test set, so report the test set size as the total
print("Number of mislabeled points out of a total %d points : %d" % (X_test.shape[0], (y_test != y_result).sum()))
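If you want a bit more detail than a single accuracy number, something like the following might help. It is only a sketch: fixing `random_state=0` and passing `stratify` are my own choices here (not part of the code above) so that the split is reproducible and balanced across the three classes. `confusion_matrix` shows which classes get mixed up, and `predict_proba` shows the posterior probabilities the model assigns to each class.

```python
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()

# random_state and stratify are assumptions added for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

gnb = GaussianNB().fit(X_train, y_train)
y_pred = gnb.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Posterior probability of each class for the first few test samples.
proba = gnb.predict_proba(X_test)
print(proba[:3].round(3))
```

Off-diagonal entries in the confusion matrix are the mislabeled points, broken down by which class they were confused with.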
I'll add more to this later XD