R Practice - DOE

Author

Dr. Cohen

Tidyverse

  • dplyr: manipulating data frames

  • purrr: working with functions

  • ggplot2: visualization
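As a rough illustration of what each of these packages is used for, here is a minimal sketch with one call from each. It uses the built-in mtcars data set so that it runs on its own. The object names small_cars, col_means and p, and the derived column kpl, are assumptions made for this example only.

```r
library(dplyr)
library(purrr)
library(ggplot2)

# dplyr: keep the 4-cylinder cars and add a derived column
# (kpl is an illustrative name; 1 mpg is roughly 0.425 km per litre)
small_cars <- mtcars %>%
  filter(cyl == 4) %>%
  mutate(kpl = mpg * 0.425)

# purrr: apply a function to every column and collect a numeric vector
col_means <- map_dbl(mtcars, mean)

# ggplot2: build a scatter plot of weight against fuel economy
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```

Printing p draws the plot; small_cars and col_means can be inspected with head() and print().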

Data is tidy when each row represents one observation and each column represents one variable.

#install.packages("tidyverse")
#install.packages("dslabs")

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)

data("murders")

#tidy data
head(murders)
       state abb region population total
1    Alabama  AL  South    4779736   135
2     Alaska  AK   West     710231    19
3    Arizona  AZ   West    6392017   232
4   Arkansas  AR  South    2915918    93
5 California  CA   West   37253956  1257
6   Colorado  CO   West    5029196    65
head(co2)
[1] 315.42 316.31 316.50 317.56 318.13 318.00
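Unlike murders, co2 is a time series object rather than a data frame, which is why head() prints a bare numeric vector. The following is a minimal sketch of one way to put it into tidy form, with one row per monthly observation. The column names year, month and ppm and the object name co2_tidy are assumptions chosen for this example, not part of the original notes.

```r
library(tibble)

# co2 is a monthly time series (class "ts") from the base datasets package
co2_tidy <- tibble(
  year  = floor(as.numeric(time(co2))),  # time() gives decimal years
  month = as.integer(cycle(co2)),        # cycle() gives the position within the year
  ppm   = as.numeric(co2)                # the measured values
)

head(co2_tidy)
```

Each row now holds a single observation, and each column holds a single variable.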
data("ChickWeight")
data("relig_income")
# we need to make this data tidy using pivot_longer()
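The reshaping step mentioned above, which tidyr provides as pivot_longer(), might look like the sketch below for relig_income. That data set has one column per income bracket, so the bracket is a variable stored in the column names. The new column names income and count and the object name relig_tidy are assumptions chosen for this example.

```r
library(tidyr)

# relig_income ships with tidyr: one row per religion,
# one column per income bracket
relig_tidy <- pivot_longer(
  relig_income,
  cols      = -religion,   # pivot every column except religion
  names_to  = "income",    # old column names become values of this column
  values_to = "count"      # old cell values go into this column
)

head(relig_tidy)
```

After pivoting, each row is one religion and income bracket combination, which matches the tidy data definition of one observation per row.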