library(tidyverse)
Welcome to R
STA 199
Today
By the end of today you will…
- begin to know your way around RStudio
- be able to define package, data frame, variable, function, argument
- use the function glimpse()
Getting started
Clone the ae1 repo from the GitHub organization
R as a calculator
- Use R as a calculator by typing the following into the console:
5 * 5 + 10
x = 3
x + x^2
x = 1:10
x * 7
In the last two examples we saved a value as the object “x”.
We can “print” x to the screen by typing the name of the object (“x”) in the console or in a code chunk.
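As a small sketch of the point above, the lines below save a value and then print it in two ways. The object name y is only for illustration. This sketch uses `<-`, the assignment operator most R style guides prefer; the `=` used in these notes behaves the same way here.

```r
# Save the integers 1 through 10 as the object x
x <- 1:10

# Typing the name of the object prints it
x

# print() does the same thing explicitly
print(x)

# Arithmetic is applied to every element of x at once
y <- x * 7
y
```

After running this, both x and y should appear in the environment pane of RStudio.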
Tour of RStudio
- environment
R functions - loading and viewing a data frame
Load a package
library(tidyverse)
Load data
roster = read_csv("data/roster.csv")
survey = read_csv("data/survey.csv")
Question: What objects store the data in the code chunk above? Can you print them to the screen?
Create a new code chunk with Cmd+Option+I (Mac) or Ctrl+Alt+I (Windows/Linux).
So far we’ve seen two functions: library() and read_csv(). In R, a function name is followed by parentheses. A function takes one or more inputs, called arguments, and often (but not always) returns an output. To learn more about a function, check its documentation with ?, e.g. ?library.
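To make the vocabulary concrete, here is a minimal sketch using seq(), a base R function that is not otherwise part of these notes. The object names evens and same_evens are only for illustration.

```r
# seq() is a function; from, to, and by are its arguments
evens <- seq(from = 2, to = 10, by = 2)
evens

# Arguments can also be matched by position instead of by name
same_evens <- seq(2, 10, 2)
same_evens

# To open the documentation for seq(), run the next line in the console:
# ?seq
```

Naming the arguments is longer to type, but it makes the code easier to read later.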
Demos
Let’s glimpse the data frame.
glimpse(survey)
Rows: 12
Columns: 5
$ name <chr> "A", "Appa", "Bumi", "Soka", "Katara", "Suki", "Z…
$ email <chr> "the-last-Rbender@duke.edu", "yip-yip-appa@duke.e…
$ bender <chr> "Airbender", "Airbender", "Earthbender", "None", …
$ previous_programming <chr> "No", "No", "No", "Somewhat", "Yes", "Yes", "Yes"…
$ year <dbl> 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4
To look at the whole data frame, we can use view().
view(survey)
View the roster data in the console
Terminology: the “columns” of a data frame are called variables, whereas the “rows” are called observations.
Question: How many variables are in the data frame survey? How many observations? What about the data frame roster?
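One way to count variables and observations is sketched below on a made-up data frame. The object toy and its values are hypothetical, not the course data. The same functions should work on survey and roster.

```r
# Hypothetical toy data frame: 3 observations (rows), 2 variables (columns)
toy <- data.frame(
  name = c("Appa", "Bumi", "Katara"),
  year = c(2, 3, 1)
)

nrow(toy)   # number of observations
ncol(toy)   # number of variables
names(toy)  # names of the variables
```

The Rows and Columns lines at the top of the glimpse() output report the same two counts.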
Why must I input specific email formats?
roster |>
  left_join(survey, by = "email")
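One likely reason is that left_join() matches rows only when the key values are exactly equal, so an email typed in a different format will not line up with the roster. The sketch below illustrates this with made-up data. The objects roster_toy, survey_toy, and joined_toy and the example.edu addresses are hypothetical and do not come from the course files.

```r
library(dplyr)  # part of the tidyverse

# Hypothetical roster with two students
roster_toy <- tibble(
  email = c("appa@example.edu", "katara@example.edu")
)

# Hypothetical survey responses; the second email has a capital K
survey_toy <- tibble(
  email  = c("appa@example.edu", "Katara@example.edu"),
  bender = c("Airbender", "Waterbender")
)

# Keep every row of the roster and attach matching survey columns
joined_toy <- roster_toy |>
  left_join(survey_toy, by = "email")

joined_toy
```

In this sketch the first student is matched, while the second gets a missing value (NA) for bender because "katara@example.edu" and "Katara@example.edu" are different strings.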