
## [Day 20] Data Visualization (3): Bokeh

Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
Welcome to Bokeh - Bokeh 0.12.3 documentation

Visualizations built on web technologies (that is, JavaScript-based) appear to be the inevitable future.
Wes McKinney

• Histogram
• Scatter plot
• Line plot
• Bar plot
• Box plot

```shell
$ conda install -c anaconda bokeh=0.12.3
```

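## Histogram

### Python

```python
# bokeh.charts ships with Bokeh 0.12.3 (installed above); later Bokeh releases removed it
from bokeh.charts import Histogram, show
import numpy as np

# Draw 100,000 samples from the standard normal distribution (mean 0, standard deviation 1)
normal_samples = np.random.normal(size=100000)
hist = Histogram(normal_samples)
show(hist)
```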
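The `bokeh.charts` interface was removed from later Bokeh releases, so a similar histogram can be sketched with `bokeh.plotting` instead: bin the samples with `numpy.histogram`, then draw one rectangle per bin with `quad`. The helper names `histogram_bins` and `histogram_figure`, the figure title, and the default of 50 bins are assumptions made for illustration and are not part of the original example.

```python
import numpy as np


def histogram_bins(samples, bins=50):
    """Count how many samples fall into each of `bins` equal-width intervals."""
    counts, edges = np.histogram(samples, bins=bins)
    return counts, edges


def histogram_figure(samples, bins=50):
    """Build a Bokeh figure that draws one rectangle per histogram bin."""
    # Imported here so that histogram_bins stays usable without Bokeh installed
    from bokeh.plotting import figure

    counts, edges = histogram_bins(samples, bins)
    fig = figure(title="Standard normal samples")
    # Each bar spans one bin on the x axis and rises from 0 to that bin's count
    fig.quad(top=counts, bottom=0, left=edges[:-1], right=edges[1:])
    return fig
```

With Bokeh installed, importing `show` from `bokeh.plotting` and calling `show(histogram_figure(np.random.normal(size=100000)))` should render the plot in a browser.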

### R

```r
library(ggplot2)
library(plotly)

# Draw 100,000 samples from the standard normal distribution (mean 0, standard deviation 1)
normal_samples <- rnorm(100000)
normal_samples_df <- data.frame(normal_samples)
# Density-scaled histogram with a kernel density curve drawn on top
hist <- ggplot(normal_samples_df, aes(x = normal_samples)) +
  geom_histogram(aes(y = ..density..)) +
  geom_density()
ggplotly(hist)
```

## Scatter plot

### Python

```python
from bokeh.charts import Scatter, show
import pandas as pd

# Speed and stopping distance, copied from R's built-in cars dataset
speed = [4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 22, 23, 24, 24, 24, 24, 25]
dist = [2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26, 34, 34, 46, 26, 36, 60, 80, 20, 26, 54, 32, 40, 32, 40, 50, 42, 56, 76, 84, 36, 46, 68, 32, 48, 52, 56, 64, 66, 54, 70, 92, 93, 120, 85]

cars_df = pd.DataFrame({
    "speed": speed,
    "dist": dist,
})

scatter = Scatter(cars_df, x="speed", y="dist")
show(scatter)
```

### R

```r
library(ggplot2)
library(plotly)

# cars is a built-in R dataset with speed and dist columns
scatter_plot <- ggplot(cars, aes(x = speed, y = dist)) + geom_point()
ggplotly(scatter_plot)
```

## Line plot

### Python

```python
from bokeh.charts import Line, show
import pandas as pd

# Speed and stopping distance, copied from R's built-in cars dataset
speed = [4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 22, 23, 24, 24, 24, 24, 25]
dist = [2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26, 34, 34, 46, 26, 36, 60, 80, 20, 26, 54, 32, 40, 32, 40, 50, 42, 56, 76, 84, 36, 46, 68, 32, 48, 52, 56, 64, 66, 54, 70, 92, 93, 120, 85]

cars_df = pd.DataFrame({
    "speed": speed,
    "dist": dist,
})

line = Line(cars_df, x="speed", y="dist")
show(line)
```

### R

```r
library(ggplot2)
library(plotly)

# cars is a built-in R dataset with speed and dist columns
line <- ggplot(cars, aes(x = speed, y = dist)) + geom_line()
ggplotly(line)
```

## Bar plot

### Python

```python
from bokeh.charts import Bar, show
import pandas as pd

# Number of cars with 4, 6, and 8 cylinders, matching the cyl column of R's mtcars dataset
cyls = [11, 7, 14]
labels = ["4", "6", "8"]
cyl_df = pd.DataFrame({
    "cyl": cyls,
    "label": labels,
})

bar = Bar(cyl_df, values="cyl", label="label")
show(bar)
```
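The counts 11, 7, and 14 passed to `Bar` correspond to the number of cars with 4, 6, and 8 cylinders, which `geom_bar()` tallies automatically in the R version. As a minimal sketch, the same tally can be reproduced from the raw `cyl` values (copied from the box plot example later in this post) using only the standard library. The variable name `cyl_counts` is introduced here for illustration.

```python
from collections import Counter

# Cylinder count of each car, copied from the box plot example in this post
cyl = [6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4]

# Tally how many cars have each cylinder count
cyl_counts = Counter(cyl)

# Sort by cylinder count so that labels and values stay aligned
labels = [str(k) for k in sorted(cyl_counts)]
cyls = [cyl_counts[k] for k in sorted(cyl_counts)]

print(labels)  # ['4', '6', '8']
print(cyls)    # [11, 7, 14]
```

The resulting `labels` and `cyls` lists can be placed in the data frame in the same way as the hard-coded values.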

### R

```r
library(ggplot2)
library(plotly)

# geom_bar() counts the rows of the built-in mtcars dataset for each cyl value
bar <- ggplot(mtcars, aes(x = cyl)) + geom_bar()
ggplotly(bar)
```

## Box plot

### Python

```python
from bokeh.charts import BoxPlot, show, output_notebook
import pandas as pd

# Render the plot inline in a Jupyter notebook
output_notebook()

# Miles per gallon and cylinder count, copied from R's built-in mtcars dataset
mpg = [21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, 33.9, 21.5, 15.5, 15.2, 13.3, 19.2, 27.3, 26, 30.4, 15.8, 19.7, 15, 21.4]
cyl = [6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4]
mtcars_df = pd.DataFrame({
    "mpg": mpg,
    "cyl": cyl,
})

# One box of mpg values per cylinder count
box = BoxPlot(mtcars_df, values="mpg", label="cyl")
show(box)
```
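A box plot summarizes each group by its median and quartiles. As a rough check on what the chart should show, the sketch below groups `mpg` by `cyl` and computes each group's median with the standard library. The names `mpg_by_cyl` and `medians` are introduced here for illustration and do not come from the original example.

```python
from collections import defaultdict
from statistics import median

mpg = [21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, 33.9, 21.5, 15.5, 15.2, 13.3, 19.2, 27.3, 26, 30.4, 15.8, 19.7, 15, 21.4]
cyl = [6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4]

# Collect the mpg values that belong to each cylinder count
mpg_by_cyl = defaultdict(list)
for cylinders, miles_per_gallon in zip(cyl, mpg):
    mpg_by_cyl[cylinders].append(miles_per_gallon)

# The median is the line drawn inside each box
medians = {k: median(v) for k, v in sorted(mpg_by_cyl.items())}
print(medians)  # {4: 26, 6: 19.7, 8: 15.2}
```

The medians fall as the cylinder count rises, which is the pattern the three boxes should display.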

### R

```r
library(ggplot2)
library(plotly)

# factor(cyl) treats the cylinder count as a category, giving one box per value
box <- ggplot(mtcars, aes(y = mpg, x = factor(cyl))) + geom_boxplot()
ggplotly(box)
```
