Unit overview

Motivation

Thus far, we’ve been building models of one data variable: \(Y\). However, we typically want to understand the relationship between \(Y\) and a set of predictors \((X_1, X_2, \ldots, X_p)\). For example, we might wish to understand how a person’s belief in climate change, \(Y\), relates to their age (\(X_1\)) and years of education (\(X_2\)).


Regression vs classification

We’ll conventionally refer to the study and prediction of \(Y\) using information from predictors \((X_1, X_2, \ldots, X_p)\) as:

  • regression (when \(Y\) is quantitative)
  • classification (when \(Y\) is categorical)