In the Asynchronous Lecture
R programming basics: for and while loops; reprex and datapasta
In the Synchronous Lecture
RMarkdown to generate reproducible documents.
If you have any questions while watching the pre-recorded material, be sure to write them down and bring them up during the synchronous portion of the lecture.
The following tabs contain pre-recorded lecture materials for class this week. Please review these materials prior to the synchronous lecture.
Total time: Approx. 1 hour and 10 minutes
for Loops
# Indexing
fruit <- c("bananas","apples","grapes","pears","oranges")
length(fruit) # See how many entries we have
fruit[1] # Extracting the first item
# Say we wanted to print off each item in the vector.
cat("I like",fruit[1])
cat("I like",fruit[2])
cat("I like",fruit[3])
cat("I like",fruit[4])
cat("I like",fruit[5])
# Note how we're systematically counting through the index positions to access
# each item. This is laborious to do by hand, especially if this data object was
# really big. Fear not! This is where the loop comes in.
# For Loops
for( i in fruit ){ # i takes on each value of fruit, one at a time.
  cat("I like",i,"\n")
}
# What's happening with i?
i <- fruit[1]
i <- fruit[2]
# ...
i <- fruit[5]
# Now let's use the for loop to iterate, using the index as a pointer
# Create a vector of numbers
numbers <- c(10,100,-1000,345,-7,999,21345,444457)
# Iterate from the 2nd index until the last
for( i in 2:length(numbers) ){
  # Multiply the current number by the one before it
  new_number <- numbers[i]*numbers[i-1]
  print(new_number) # print the product
}
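One caveat with index-based loops like the one above: `2:length(numbers)` counts backwards when the vector has fewer than two entries (for example, `2:1` yields `2 1`). The following is a minimal sketch of a more defensive pattern that builds the index with `seq_along()`. The names `idx` and `products` are introduced here for illustration and are not part of the lecture code.

```r
numbers <- c(10, 100, -1000, 345, -7, 999, 21345, 444457)

# seq_along(numbers) yields 1, 2, ..., length(numbers); dropping the first
# index leaves only the positions that have a predecessor.
idx <- seq_along(numbers)[-1]

# Pre-allocate a vector to hold each product (illustrative name)
products <- numeric(length(idx))

for (i in idx) {
  # Multiply the current number by the one before it
  products[i - 1] <- numbers[i] * numbers[i - 1]
}

products
```

If `numbers` held a single value, `idx` would be empty and the loop body would simply never run.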
# Finally, note that there is no special rule that the pointer has to be `i`. It can
# be any valid object name. For example, let's trade `i` for `my_favorite_fruit`
for( my_favorite_fruit in fruit ){
  cat("I like",my_favorite_fruit,"\n")
}
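The same iteration can also be written as a `while` loop, the other loop type listed in the overview. This is a minimal sketch that reuses the `fruit` vector from above; the counter `position` and the `messages` vector are illustrative names, not taken from the lecture.

```r
fruit <- c("bananas", "apples", "grapes", "pears", "oranges")

position <- 1                          # start at the first index
messages <- character(length(fruit))   # pre-allocated storage for the output

# Keep looping for as long as the condition is TRUE
while (position <= length(fruit)) {
  messages[position] <- paste("I like", fruit[position])
  cat(messages[position], "\n")
  position <- position + 1             # without this update the loop never ends
}
```

Unlike `for`, a `while` loop does not advance on its own, so the counter has to be updated inside the body.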
while Loops
# (1) Using a container to store data values
dat_container <- as.data.frame(matrix(0, nrow = 5,ncol = 2))
dat_container
for( i in 1:5){
  dat_container[i,1] <- i
  dat_container[i,2] <- letters[i]
}
dat_container
# (2) Binding output each iteration
container2 <- c()
for( i in 1:10){
  tmp_data <- data.frame(v1=i,v2=letters[i])
  container2 <- rbind(container2,tmp_data)
}
container2
# Which is faster?
# Using a pre-allocated container
require(tictoc) # for timing how long it takes R to run the code
N <- 5000
tic()
dat_container <- as.data.frame(matrix(0, nrow = N,ncol = 2))
for( i in 1:N){
  dat_container[i,1] <- i
  dat_container[i,2] <- 1/i
}
toc()
# Building the container by binding a new row each iteration
N <- 5000
tic()
dat_container <- c()
for( i in 1:N){
  tmp <- data.frame(V1 = i,V2 = 1/i)
  dat_container <- rbind(dat_container,tmp)
}
toc()
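The same comparison can be run without the tictoc package by using base R's `system.time()`. The sketch below wraps the two approaches in functions; `fill_container()` and `grow_container()` are hypothetical helper names introduced here, and `N` is reduced so the example runs quickly. Timings vary by machine, so no expected output is shown.

```r
N <- 2000

# Approach 1: fill a pre-allocated data frame
fill_container <- function(n) {
  out <- as.data.frame(matrix(0, nrow = n, ncol = 2))
  for (i in 1:n) {
    out[i, 1] <- i
    out[i, 2] <- 1 / i
  }
  out
}

# Approach 2: grow the data frame with rbind() each iteration
grow_container <- function(n) {
  out <- c()
  for (i in 1:n) {
    out <- rbind(out, data.frame(V1 = i, V2 = 1 / i))
  }
  out
}

# system.time() reports how long each expression takes to evaluate
time_fill <- system.time(filled <- fill_container(N))
time_grow <- system.time(grown <- grow_container(N))

time_fill["elapsed"]
time_grow["elapsed"]
```

Both functions return the same values, so any difference in elapsed time comes from how the container is built rather than from what it holds.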
if(TRUE){ # Requires a condition
  print("Hello") # Code if the condition is True
}else{
  print("Goodbye") # Code if the condition is False
}
# Control structures + Loop
for(i in 1:10){
  if(i <= 5){
    print("Hello")
  }else{
    print("Goodbye")
  }
}
# If-else vectorization
x <- 1:10
ifelse(x<5,"Hello","Goodbye")
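Because `ifelse()` evaluates the condition for every element at once, it can stand in for the loop-plus-`if` pattern shown earlier. The sketch below compares the two; note that it uses `x <= 5` so that both versions apply the same condition, and that `loop_result` and `vectorized_result` are illustrative names.

```r
x <- 1:10

# Loop version: test one element per iteration
loop_result <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] <= 5) {
    loop_result[i] <- "Hello"
  } else {
    loop_result[i] <- "Goodbye"
  }
}

# Vectorized version: test all elements in a single call
vectorized_result <- ifelse(x <= 5, "Hello", "Goodbye")

# Check that both approaches produce the same vector
identical(loop_result, vectorized_result)
```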
# When code breaks --------------------------------------------------------
# Here is an example of a code chunk with a problem.
my_data <- cars
my_data$new_var <- NA # create a new variable
for( i in 1:nrow(my_data)){
  # using speed, categorize cars as fast/slow
  if(my_data$speed[1] >= mean(my_data$speed)){
    my_data$new_var[i] <- "fast"
  }else{
    my_data$new_var[i] <- "slow"
  }
}
# Print off
head(my_data)
# This doesn't look right, all the cars are slow!
# Generating a reproducible example of the code ---------------------------------------
require(reprex) # Load the package
reprex() # Run the function (by default it renders the code currently copied to the clipboard)
# Generating a reproducible example of the data ---------------------------------------
require(datapasta) # Load the package
# Small example
ex_dat <- head(cars)
# To save output to clipboard
df_format(ex_dat)
tribble_format(ex_dat)
# To instantly paste output
df_paste(ex_dat)
tribble_paste(ex_dat)
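If datapasta is not available, base R's `dput()` offers a similar way to turn a small data object into code that recreates it. This is an alternative sketched here for comparison, not part of the lecture's workflow; `ex_code` and `ex_rebuilt` are illustrative names.

```r
ex_dat <- head(cars)

# dput() prints R code that recreates the object; capture that code as text
ex_code <- paste(capture.output(dput(ex_dat)), collapse = "\n")
cat(ex_code, "\n")

# Evaluating the captured code rebuilds the data frame
ex_rebuilt <- eval(parse(text = ex_code))

# Compare the original and the rebuilt data
all.equal(ex_dat, ex_rebuilt)
```

The text stored in `ex_code` can be pasted into a question or a reprex so that others can recreate the data without access to the original object.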
# All together now --------------------------------------------------------
# Small representative example of the data
ex_dat <- head(cars)
tribble_format(ex_dat)
# Tailor the small code example
# Data
my_data <- tibble::tribble(
~speed, ~dist,
4, 2,
4, 10,
7, 4,
7, 22,
8, 16,
9, 10
)
# Code
my_data$new_var <- NA # create a new variable
for( i in 1:nrow(my_data)){
  # using speed, categorize cars as fast/slow
  if(my_data$speed[1] >= mean(my_data$speed)){
    my_data$new_var[i] <- "fast"
  }else{
    my_data$new_var[i] <- "slow"
  }
}
# Print off output
head(my_data)
# Copy the above and then run reprex()
reprex()
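For reference, the problem in the example above is that the condition always checks `my_data$speed[1]`, the first row, rather than the row for the current iteration, so every car receives the same label. A sketch of one possible correction on the built-in `cars` data follows; it is offered as an illustration and is not necessarily the resolution given in the lecture.

```r
my_data <- cars
my_data$new_var <- NA  # create a new variable

for (i in seq_len(nrow(my_data))) {
  # Index with i so that each row's speed is compared against the mean speed
  if (my_data$speed[i] >= mean(my_data$speed)) {
    my_data$new_var[i] <- "fast"
  } else {
    my_data$new_var[i] <- "slow"
  }
}

# Count how many cars fall into each category
table(my_data$new_var)
```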
The following survey asks you quick questions regarding the usefulness of the asynchronous lecture materials. Feedback will be used to modify aspects of the asynchronous materials moving forward.
These exercises are designed to help you reinforce your grasp of the concepts covered in the asynchronous lecture material.
Write a loop that iterates over the following two vectors and multiplies the values located at each index location. So, for example, for the first iteration 3 will be multiplied by 3.3, for the second 5 by 4.2, and so on. Print off the answer each iteration.
Write a loop that iterates through the mtcars data frame (which comes built into R) and prints the message “Low Gear” if the gear variable is less than 4, else prints “High Gear”.
The following materials were generated for students enrolled in PPOL670. Please do not distribute without permission.
ed769@georgetown.edu | www.ericdunford.com