14 Mid-semester review
Settling in
You won’t type anything today! For future reference you can access the following:
- The activity qmd file which contains some review topics (but no questions).
- The mid-semester review file which contains some practice questions (that are NOT an exhaustive review).
Review the basics of wrangling and visualization
14.1 Warm-up

Recall the major themes of our work thus far:
Getting to know your data
- key functions:
head(), dim(), nrow(), class(), str()
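As a quick refresher, here is a minimal sketch of these functions in action. The `quiz_scores` data frame is a made-up toy example, not a dataset from the course.

```r
# Hypothetical toy data, invented purely for illustration
quiz_scores <- data.frame(
  student = c("s1", "s2", "s3"),
  score = c(8, 10, 7)
)

head(quiz_scores, 2)   # the first 2 rows
dim(quiz_scores)       # number of rows, then number of columns
nrow(quiz_scores)      # number of rows only
class(quiz_scores)     # what kind of object this is
str(quiz_scores)       # compact summary of each column's type and first values
```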
Data visualization
- picking an appropriate plot / evaluating the appropriateness of a plot
- interpreting the results of a plot
- building effective plots that are accessible and “professional”
- key functions:
ggplot(), geom_bar(), geom_density(), geom_boxplot(), geom_histogram(), geom_point(), geom_line(), geom_smooth(), facet_wrap()
- key features:
color, fill, fig.alt, fig.caption
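To see how several of these pieces fit together, here is a small sketch that layers points and a trend line, maps color to a grouping variable, and splits the plot into panels. The `plants` data and its variable names are invented for illustration.

```r
library(ggplot2)

# Hypothetical toy data, invented purely for illustration
plants <- data.frame(
  width = c(4, 5, 6, 5, 8, 9, 7, 10),
  height = c(10, 12, 15, 11, 18, 20, 17, 22),
  species = rep(c("A", "B"), each = 4)
)

# Scatterplot with a linear trend line, colored by species,
# with one panel per species
p <- ggplot(plants, aes(x = width, y = height, color = species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ species)

p

# In a qmd file, alt text is not part of the ggplot code.
# It goes in the chunk header, for example: {r, fig.alt = "..."}
```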
Data preparation: wrangling data
- goals
  - wrangle our data
  - obtain numerical summaries of our data
- key functions
  - arrange() our data in a meaningful order
  - subset the data to only filter() the rows and select() the columns of interest
  - mutate() existing variables and define new variables
  - summarize() various aspects of a variable, both overall and by group (group_by())
  - count() up the number of instances of an outcome or set of outcomes
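The wrangling verbs above are usually chained together. Below is one possible sketch using a made-up `orders` data frame (the data and column names are assumptions for illustration only).

```r
library(dplyr)

# Hypothetical toy data, invented purely for illustration
orders <- data.frame(
  customer = c("ana", "ben", "ana", "cy", "ben", "ana"),
  item = c("tea", "tea", "mug", "mug", "tea", "tea"),
  price = c(4, 4, 12, 12, 4, 4),
  quantity = c(2, 1, 1, 3, 5, 1)
)

# How much did each customer spend on tea, from most to least?
tea_totals <- orders |>
  filter(item == "tea") |>               # keep only the tea rows
  select(customer, price, quantity) |>   # keep only the columns we need
  mutate(total = price * quantity) |>    # define a new variable
  group_by(customer) |>                  # summarize within each customer
  summarize(spent = sum(total)) |>
  arrange(desc(spent))                   # biggest spender first

# How many orders were there of each item?
item_counts <- orders |>
  count(item)
```

Here `tea_totals` has one row per tea-buying customer and `item_counts` has one row per item.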
Data preparation: reshaping data
- goal: reshape our data to fit the task at hand
- functions:
pivot_longer(), pivot_wider()
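A small sketch of the two pivot functions, which undo one another. The `temps_wide` data is a made-up example.

```r
library(tidyr)

# Hypothetical toy data, invented purely for illustration:
# one row per city, one column per month
temps_wide <- data.frame(
  city = c("X", "Y"),
  jan = c(1, 10),
  jul = c(25, 30)
)

# Longer: one row per city / month combination
temps_long <- temps_wide |>
  pivot_longer(cols = c(jan, jul), names_to = "month", values_to = "temp")

# Wider: back to one row per city
temps_back <- temps_long |>
  pivot_wider(names_from = month, values_from = temp)
```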
Data preparation: joining data
- goal: join different datasets into one
- functions
  - mutating joins which combine columns of different datasets: left_join(), inner_join(), full_join()
  - filtering joins which filter rows according to membership / non-membership in another dataset: semi_join(), anti_join()
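To compare the joins side by side, here is a sketch with two made-up datasets, `students` and `grades`, that share an `id` column (both are assumptions for illustration).

```r
library(dplyr)

# Hypothetical toy data, invented purely for illustration
students <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
grades <- data.frame(id = c(2, 3, 4), grade = c("A", "B", "C"))

# Mutating joins: add the grade column to students
left <- left_join(students, grades, by = "id")    # all students; NA grade if no match
inner <- inner_join(students, grades, by = "id")  # only ids that appear in both
full <- full_join(students, grades, by = "id")    # every id from either dataset

# Filtering joins: keep or drop rows of students, adding no columns
semi <- semi_join(students, grades, by = "id")    # students who have a grade
anti <- anti_join(students, grades, by = "id")    # students who do NOT have a grade
```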
Data preparation: working with factor variables
- goals
  - turn character variables into factor variables (when necessary)
  - turn factor variables into more meaningful factor variables
- key functions
  - reordering categories / levels: fct_relevel(), fct_reorder()
  - change category labels: fct_recode()
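Here is a minimal sketch of the three factor functions on a made-up `sizes` variable. The `amount` values are also invented, and are only there to give fct_reorder() something to order by.

```r
library(forcats)

# Hypothetical toy data, invented purely for illustration
sizes <- factor(c("small", "large", "medium", "small", "large"))
amount <- c(5, 1, 9, 7, 3)

levels(sizes)  # by default, levels are in alphabetical order

# Manually put the levels in a meaningful order
sizes_manual <- fct_relevel(sizes, "small", "medium", "large")

# Order the levels by the median of another variable (lowest first)
sizes_by_amount <- fct_reorder(sizes, amount)

# Change the labels (new label = "old label"); the level order stays the same
sizes_short <- fct_recode(sizes, S = "small", M = "medium", L = "large")
```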
Data preparation: working with strings
- goal: detect, replace, or extract certain patterns from character strings
- key functions
  - return a modified string: str_replace(), str_replace_all(), str_to_lower(), str_sub()
  - return a set of TRUE/FALSE: str_detect()
  - return a number: str_length()
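The sketch below runs each string function on the same made-up vector, `courses`, so that the three kinds of output (modified strings, TRUE/FALSE, numbers) are easy to compare.

```r
library(stringr)

# Hypothetical toy data, invented purely for illustration
courses <- c("STAT 112", "COMP 123", "STAT 155")

# Return a modified string
first_only <- str_replace(courses, "1", "#")       # replaces only the FIRST "1" in each string
every_one <- str_replace_all(courses, "1", "#")    # replaces EVERY "1" in each string
lower <- str_to_lower(courses)                     # all lowercase
dept <- str_sub(courses, 1, 4)                     # characters 1 through 4

# Return a set of TRUE/FALSE
is_stat <- str_detect(courses, "STAT")

# Return a number
n_char <- str_length(courses)                      # counts every character, including spaces
```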
IMPORTANT
A list of these key functions will be provided to you on the quiz, without the corresponding context. That list will appear something like this:
- ggplot functions: ggplot(), geom_bar(), geom_boxplot(), geom_density(), geom_histogram(), geom_line(), geom_point(), geom_smooth(), facet_wrap()
- wrangling functions: arrange(), count(), filter(), group_by(), mutate(), select(), summarize()
- pivot_ functions: pivot_longer(), pivot_wider()
- _join functions: anti_join(), full_join(), inner_join(), left_join(), semi_join()
- fct_ functions: fct_recode(), fct_relevel(), fct_reorder()
- str_ functions: str_detect(), str_length(), str_replace(), str_replace_all(), str_sub(), str_to_lower()
TODAY
We’ll practice SOME of these concepts today. Important caveats:
- This activity is NOT an exhaustive review – it doesn’t cover every topic or every type of question you’ll be asked. For example, it overemphasizes older material as it’s less fresh.
- Be kind to yourself! If you haven’t started studying / reviewing yet, this might feel bumpy.
14.2 Part 1: What’s the verb?
Goal: Review some of the data preparation functions we’ve learned.
Directions: In your group, complete the provided Part 1 activity. Once you’re done, let me know. I’ll then check your answers and give you Part 2.
14.3 Part 2: Quiz practice
Goal: Practice some problems that are more in the style of the quiz questions. These give you a sense of the structure, vibe, and types of questions that might be asked so that none of that comes as a surprise.
Directions: In your group, complete Part 2 of the activity. Once you’ve completed your work, work on Homework 6 (or anything else related to this class).
14.4 Wrap-up
Homework 6 is due tonight by 11:59pm.
Quiz 2 is Tuesday (October 29).
- Study tips:
- Make a study sheet based on the activities. Though you can’t bring this into the quiz, it’s helpful for studying.
- Study your study sheet!
- Review all checkpoints, activities, and homework (in that order). Try doing the exercises without peeking at solutions. Take note of where you need to spend more time studying.
- Review Quiz 1.
- The following list of functions will be provided to you. It will look something like this:
  - ggplot functions: ggplot(), geom_bar(), geom_boxplot(), geom_density(), geom_histogram(), geom_line(), geom_point(), geom_smooth(), facet_wrap()
  - wrangling functions: arrange(), count(), filter(), group_by(), mutate(), select(), summarize()
  - pivot_ functions: pivot_longer(), pivot_wider()
  - _join functions: anti_join(), full_join(), inner_join(), left_join(), semi_join()
  - fct_ functions: fct_recode(), fct_relevel(), fct_reorder()
  - str_ functions: str_detect(), str_length(), str_replace(), str_replace_all(), str_sub(), str_to_lower()