iT邦幫忙

2018 iT 邦幫忙 Ironman Contest
DAY 4
Data Technology

A Workplace Veteran's First Taste of Data Science: R Project Implementation Log series, Part 4

(Day 4) tidyverse (Part 1): Introduction

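According to Hadley Wickham's explanations in the tidyverse documentation and elsewhere, tidyverse has two basic meanings:
(1) A programming style, derived from the code style defined at google.github.io (Google's R style guide), that keeps code clear, concise, and readable;
(2) A collection of R packages written in that concise style. (2017/11/15: tidyverse 1.2.0)
In the word tidyverse, tidy means neat and verse means a verse or line of poetry; together they suggest code or data as neat and readable as lines of poetry, that is, "tidy code" or "tidy data". Being familiar with this style and the related R packages makes data processing and coding faster and more efficient, and makes it easier to communicate with other data analysts.
The overall data science workflow: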
https://ithelp.ithome.com.tw/upload/images/20171225/20107033KZy3NLue9R.png

Tidyverse packages

Installation and use

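  • Install the package with: install.packages("tidyverse")
    This package is what the name tidyverse refers to; it covers the packages needed throughout the everyday data-processing cycle.
    https://ithelp.ithome.com.tw/upload/images/20171225/20107033bKvLWdLhKi.png
  • Load the library with: library(tidyverse)
    The result of loading it is:
    https://ithelp.ithome.com.tw/upload/images/20171223/20107033TNas7MF8lP.png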

Core tidyverse:

Libraries already included in the core are attached by library(tidyverse) and do not need to be loaded separately.
https://ithelp.ithome.com.tw/upload/images/20171224/20107033UEcXebW6z1.png

All other libraries must each be loaded with their own library() call.
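As a minimal sketch of the difference between core and non-core packages, the snippet below loads the core set and then attaches two non-core members by hand. It assumes the tidyverse is already installed; readxl and haven are picked only as examples, and the variable name all_pkgs is made up for illustration.

```r
# Assumes install.packages("tidyverse") has already been run once.
library(tidyverse)   # attaches only the core packages

# tidyverse_packages() lists every package in the tidyverse, core or not.
all_pkgs <- tidyverse_packages()
print(all_pkgs)

# Non-core members are installed with the tidyverse but are NOT attached
# by library(tidyverse); each one needs its own library() call.
library(readxl)
library(haven)
```

Which packages count as core depends on the installed tidyverse version, so the startup message printed by library(tidyverse) is the most reliable place to check.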

Data source processing: Import

  • readxl for .xls and .xlsx sheets.
  • haven for SPSS, Stata, and SAS data.
  • jsonlite for JSON.
  • xml2 for XML.
  • httr for web APIs.
  • rvest for web scraping.
  • DBI for relational databases. To connect to a specific database, you’ll need to pair DBI with a specific backend like RSQLite, RPostgres, or odbc. Learn more at http://db.rstudio.com.
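To make two of the import packages above concrete, here is a small sketch that parses a JSON string with jsonlite and queries a throwaway in-memory SQLite database through DBI. It assumes the jsonlite, DBI, and RSQLite packages are installed; the JSON text, the table name cars, and the variable names are invented for the example, and mtcars is just R's built-in sample data.

```r
library(DBI)
library(jsonlite)

# jsonlite: parse a JSON array of records into a data frame.
json_text <- '[{"id": 1, "lang": "R"}, {"id": 2, "lang": "SQL"}]'
records <- fromJSON(json_text)
print(records)

# DBI paired with the RSQLite backend: open an in-memory database,
# copy a data frame into it as a table, then query it with SQL.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", mtcars)
heavy_cars <- dbGetQuery(con, "SELECT mpg, wt FROM cars WHERE wt > 5")
dbDisconnect(con)
print(heavy_cars)
```

Swapping RSQLite::SQLite() for another backend such as RPostgres or odbc changes only the dbConnect() call; the other DBI functions stay the same.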

Data cleaning: Tidy/Wrangle

  • stringr for strings.
  • lubridate for dates and date-times.
  • forcats for categorical variables (factors).
  • hms for time-of-day values.
  • blob for storing blob (binary) data.
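A short sketch of what three of these wrangling packages do in practice: stringr cleans up text, lubridate parses and shifts a date, and forcats reorders factor levels. The sample values and variable names are made up for illustration, and each package is loaded explicitly so the snippet does not depend on which packages the installed tidyverse treats as core.

```r
library(stringr)
library(lubridate)
library(forcats)

# stringr: trim surrounding whitespace and normalise capitalisation.
raw_names <- c("  alice ", "BOB", "carol  ")
clean_names <- str_to_title(str_trim(raw_names))

# lubridate: parse a year-month-day string, then do date arithmetic.
signup <- ymd("2017-12-25")
signup_month <- month(signup)
one_week_later <- signup + days(7)

# forcats: reorder factor levels from most to least frequent.
sizes <- factor(c("small", "large", "small", "medium", "small", "large"))
by_frequency <- fct_infreq(sizes)

print(clean_names)
print(one_week_later)
print(levels(by_frequency))
```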

Development: Program

  • rlang provides tools to work with core language features of R and the tidyverse.
  • magrittr provides the pipe, %>%, used throughout the tidyverse. It also provides a number of more specialised piping operators (like %$% and %<>%) that can be useful in other places.
  • glue provides an alternative to paste() that makes it easier to combine data and strings.
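The pipe operators and glue() mentioned above are easiest to understand side by side. The following is only an illustrative sketch: the numbers, the variable names, and the message text are invented, and mtcars is R's built-in sample data.

```r
library(magrittr)
library(glue)

values <- c(1, 4, 9, 16)

# %>% passes the left-hand side as the first argument of the next call,
# so this reads left to right as mean(sqrt(values)).
root_mean <- values %>% sqrt() %>% mean()

# %$% exposes the columns of a data frame to the expression on the right.
mpg_wt_cor <- mtcars %$% cor(mpg, wt)

# %<>% pipes and then assigns the result back to the left-hand side.
values %<>% sqrt()

# glue() interpolates R expressions written inside {} into a string,
# where paste() would need the pieces spelled out one by one.
pkg <- "tidyverse"
ver <- "1.2.0"
msg <- glue("{pkg} {ver}")
same_msg <- paste(pkg, ver)

print(root_mean)
print(values)
print(msg)
```

Note that %$% and %<>% are not made available by library(tidyverse), which is why magrittr is attached explicitly here.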

Model

References

The tidyverse style guide
Learn the tidyverse-R for data science
CRAN-tidyverse: Easily Install and Load the 'Tidyverse'

