The filter() function lets us keep only the rows of data that meet a certain condition, so we end up with a subset of the original data. filter() is part of the dplyr package, which is one of the packages in the tidyverse.
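Before moving to real data, here is a minimal sketch of the idea on a tiny data frame. The `pets` data and its column names are made up for illustration, and the sketch loads dplyr directly rather than the whole tidyverse.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
pets <- data.frame(
  name = c("Rex", "Tom", "Milo", "Luna"),
  kind = c("dog", "cat", "dog", "cat"),
  age  = c(3, 7, 1, 5)
)

# Keep only the rows where the condition is TRUE
dogs <- pets %>% filter(kind == "dog")

# Conditions can also be numeric comparisons
older <- pets %>% filter(age > 4)

dim(pets)  # 4 rows, 3 columns
dim(dogs)  # 2 rows, 3 columns
dim(older) # 2 rows, 3 columns
```

The number of columns never changes; filter() only decides which rows stay.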
In the example below we’ll look at the Palmer penguins data, which is available as the penguins data frame in the palmerpenguins package.
library(tidyverse) # load the tidyverse package
library(palmerpenguins)
View(penguins)
dim(penguins)
pA <- penguins %>% filter(species == "Adelie")
dim(pA)
pAnot <- penguins %>% filter(species != "Adelie")
dim(pAnot)
p_bl_NA <- penguins %>% filter(is.na(bill_length_mm)) # a data frame where bill length is NA
dim(p_bl_NA)
View(p_bl_NA)
Suppose we run just the first three lines of code in RStudio. First create a new file: in the top menu, go to File, New File, R Script (the shortcut is Ctrl+Shift+N). Paste the penguins code above into your new file, then highlight the first three lines and click the Run button at the top of the Script Editor window.
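The is.na() filter in the penguins code deserves a closer look, because missing values cannot be tested with ==. The sketch below uses a small made-up data frame (`birds` is hypothetical and only borrows the bill_length_mm column name) to show the difference, assuming dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data with one missing bill length
birds <- data.frame(
  id = 1:4,
  bill_length_mm = c(39.1, NA, 46.5, 50.0)
)

# Comparing with NA gives NA, never TRUE, so this keeps no rows
wrong <- birds %>% filter(bill_length_mm == NA)

# is.na() returns TRUE for missing values, so this keeps the one NA row
bill_missing <- birds %>% filter(is.na(bill_length_mm))

# Negate with ! to keep only the rows with a measured bill length
measured <- birds %>% filter(!is.na(bill_length_mm))

# filter() keeps only rows where the condition is TRUE,
# so the NA row is dropped here as well
long_bills <- birds %>% filter(bill_length_mm > 45)

nrow(wrong)        # 0
nrow(bill_missing) # 1
nrow(measured)     # 3
nrow(long_bills)   # 2
```

This is why the penguins example uses filter(is.na(bill_length_mm)) to find the rows with no bill length recorded.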