Fundamentals of R Programming


It’s always good to start with the fundamentals. We’ll cover comments, functions, variables, data types, vectors, and pipes, and then look at packages.

Comments

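Comments are helpful when you want to describe or explain what’s going on in your code. In R, a comment starts with a # sign and runs to the end of the line. R ignores comments when it runs the code.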
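
As a minimal illustration (the variable name and numbers are made up for this example), a comment can sit on its own line or follow code on the same line:

```r
# This whole line is a comment, so R skips it.
total <- 2 + 3  # a comment can also follow code on the same line
print(total)
```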

Functions

Functions begin with a function name like print or paste, usually followed by one or more arguments in parentheses. If you want to find out more about the print function or any other function, type a question mark followed by the function name, for example ?print (adding a set of parentheses, as in ?print(), also works). Keep in mind that function names are case-sensitive, so print and Print are not the same.
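
Here is a short sketch of calling the two functions mentioned above. The strings are arbitrary examples, and the help request is left as a comment so the script also runs outside an interactive session:

```r
# print() displays its argument in the console.
print("Hello, R")

# paste() joins its arguments into one string, separated by a space by default.
greeting <- paste("Hello", "R")
print(greeting)

# Open the help page for a function (uncomment to try it in RStudio).
# ?print

# Function names are case-sensitive: print exists, but Print normally does not.
exists("print")
exists("Print")
```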

Variables
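
A variable is a name that stores a value so you can reuse it later. A minimal sketch, with made-up names and values, using R’s assignment operator <-:

```r
# Assign values to variables with <-
first_name <- "Ross"
birth_year <- 1954

# Use a variable anywhere its value is needed.
age_in_2000 <- 2000 - birth_year
print(age_in_2000)
```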

Data Types
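
Some of the most common basic data types in R are numeric, integer, character, and logical. The values below are arbitrary examples. The class() function reports the type of each one:

```r
num_type  <- class(3.5)         # "numeric"
int_type  <- class(2L)          # "integer" (the L suffix marks an integer)
chr_type  <- class("hello")     # "character"
lgl_type  <- class(TRUE)        # "logical"
date_type <- class(Sys.Date())  # "Date"

print(c(num_type, int_type, chr_type, lgl_type, date_type))
```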

Vectors
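
A vector is a sequence of elements that all have the same type, and you can create one with the c() function. A small sketch with made-up values:

```r
years <- c(1954, 1959, 1962)  # a numeric vector

n <- length(years)            # number of elements: 3
first <- years[1]             # R counts from 1, so this is 1954
next_years <- years + 1       # arithmetic applies to every element

# Mixing types forces everything to one type, here character.
mixed <- c(1, "a", TRUE)
mixed_type <- class(mixed)    # "character"

print(next_years)
```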

Pipes

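A pipe is a tool in R for expressing a sequence of multiple operations. A pipe is represented by a % sign, followed by a > sign, and another % sign. It comes from the magrittr package, which is loaded along with the tidyverse. The pipe takes the result on its left and passes it as the first argument to the function on its right, so when you see %>% in code you can read it as “and then”.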
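
To see the “and then” reading in practice, here is a small sketch that assumes the magrittr package is installed (it comes along with the tidyverse). The vector and the choice of functions are just for illustration:

```r
library(magrittr)  # provides %>%; library(dplyr) or library(tidyverse) also makes it available

values <- c(4, 9, 16)

# Without a pipe, the steps are nested and read from the inside out.
nested <- round(mean(sqrt(values)), 1)

# With a pipe: take values, and then sqrt, and then mean, and then round.
piped <- values %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested, piped)
```

Both versions do the same work; the piped version just lists the steps in the order they happen.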

Example Script

Below is a very simple example script to try in RStudio. Run it and notice what happens in the console. In the Environment pane you will also see friends. If you click on it, the data frame will appear in the script pane.

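# a comment is here
5 + 6                               # the result, 11, prints in the console
a <- 3 + 4                          # store 7 in the variable a
a                                   # print the value of a
names <- c("Ross", "Robert")        # a character vector
born <- c(1954, 1959)               # a numeric vector
friends <- data.frame(names, born)  # combine the two vectors into a data frame
View(friends)                       # open the data frame in the RStudio viewer
str(friends)                        # show the structure of the data frame
friends$names                       # the names column
friends$born                        # the born column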

Packages

By default, R includes a set of packages called Base R that are available to use in RStudio when you start your first programming session. There are also recommended packages that are installed but not loaded. To make the most of R for your data analysis, you will need to install more packages. Packages are units of reproducible R code that you can use to add more functionality to R. The best part is that the R community creates and shares packages so that other users can access them! Packages in R include reusable R functions, documentation about how to use the functions, sample datasets, and tests for checking your code. At the top of your list of packages to install might be the tidyverse.

Run the command installed.packages() to list the packages on your computer. Let’s focus on the Package and Priority columns. The Package column gives the name of the package, like cluster or graphics. The Priority column tells us what’s needed to use functions from the package. If you come across the word “base” in the Priority column, then the package comes with R and is already installed. The core ones, such as stats, graphics, and utils, are also loaded, so you can use their functions as soon as you open RStudio. If you find the word “recommended,” then the package is installed but not loaded, and you need to load it with library() before you use it.
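
The sketch below shows one way to look at just those two columns and to compare what is installed with what is currently loaded. The variable names pkgs and base_pkgs are made up for this example, and the exact output depends on your installation:

```r
pkgs <- installed.packages()  # a character matrix with one row per installed package

# Look only at the two columns discussed above.
head(pkgs[, c("Package", "Priority")])

# Names of packages whose Priority is "base" (%in% treats a missing Priority as no match).
base_pkgs <- pkgs[pkgs[, "Priority"] %in% "base", "Package"]
print(base_pkgs)

# search() lists what is loaded (attached) right now, for comparison with the list above.
print(search())
```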

You can find thousands of R packages just by doing an online search. One of the most commonly used sources of packages is CRAN. CRAN stands for Comprehensive R Archive Network. It’s an online archive with R packages, source code, manuals and documentation.
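
Installing and loading are separate steps: install.packages() downloads a package from CRAN once, and library() loads it in each new session. Here is a cautious sketch that uses the tidyverse as the example package (any CRAN package name works the same way). It loads the package only if it is already installed, and otherwise prints the command to run:

```r
pkg <- "tidyverse"  # example package name; swap in any CRAN package

if (requireNamespace(pkg, quietly = TRUE)) {
  # Already installed: load it for this session.
  library(pkg, character.only = TRUE)
} else {
  # Not installed: installing needs an internet connection, so it is left as a manual step.
  message('Run install.packages("', pkg, '") once, then library(', pkg, ").")
}
```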

In the tidyverse, there are four packages that are an essential part of the workflow for data analysts: ggplot2, dplyr, tidyr and readr.

Resources for Learning R

Here is an online book: Modern Statistics with R.

Check out the online book R for Data Science.

There is also a YouTube video called Learn R in 39 minutes by Equitable Equations.

Here’s a website for beginners called RYouWithMe. It’s by the R-Ladies in Sydney.
