20  Structuring the story


Settling in





20.1 Warm-up

You will tell your data story, with the above components, in both a written report and an oral presentation. Let’s consider the written report.

  • We discussed the general content last class.
  • Let’s talk about the general structure today.





Report: aesthetics

Aesthetics are important to telling a story with data! They can help to…

  • set a tone (is the report entertaining? serious? scientific?…)
  • keep & direct a reader’s attention (what might confuse or distract readers?)
  • establish professionalism / trust




EXAMPLE 1: Bad

Name 3 things wrong with the aesthetics below.

How long, in feet, does it take a car to come to a full stop? Consider an example:

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
cars
   speed dist
1      4    2
2      4   10
3      7    4
4      7   22
5      8   16
6      9   10
7     10   18
8     10   26
9     10   34
10    11   17
11    11   28
12    12   14
13    12   20
14    12   24
15    12   28
16    13   26
17    13   34
18    13   34
19    13   46
20    14   26
21    14   36
22    14   60
23    14   80
24    15   20
25    15   26
26    15   54
27    16   32
28    16   40
29    17   32
30    17   40
31    17   50
32    18   42
33    18   56
34    18   76
35    18   84
36    19   36
37    19   46
38    19   68
39    20   32
40    20   48
41    20   52
42    20   56
43    20   64
44    22   66
45    23   54
46    24   70
47    24   92
48    24   93
49    24  120
50    25   85

How does a car’s stopping distance depend upon its speed?

ggplot(cars, aes(y = dist, x = speed)) + 
  geom_point() + 
  geom_smooth() + 
  labs(x = "speed (mph)", y = "stopping distance (ft)") + 
  theme_bw()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'





EXAMPLE 2: Changing themes

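You can change the overall theme of your rendered HTML report by setting the theme option in the YAML header of your qmd. Quarto’s built-in HTML themes come from Bootswatch (eg: cosmo, flatly, minty):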

---
title: "Our title"
author: "Our names"
format:
  html:
    theme: PUT THEME NAME HERE
    toc: true
    toc-depth: 2
    embed-resources: true
---





EXAMPLE 3: Plot themes

In addition to using colorblind-friendly palettes, meaningful axis labels (never raw R variable names), and captions or titles, we can also change the theme of our ggplots:

https://yutannihilation.github.io/allYourFigureAreBelongToUs/ggthemes/

https://github.com/dhmontgomery/personal-work/tree/master/theme-mpr

https://bbc.github.io/rcookbook/
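
As a rough sketch of how these pieces can fit together, the plot below combines a viridis palette, readable labels, a title and caption, and a non-default theme. It assumes the mpg data that ships with ggplot2, which is not a dataset from this activity. The label text, caption, and choice of theme_minimal() are placeholders for illustration, not recommendations.

```r
library(ggplot2)

# mpg ships with ggplot2; it stands in for your own project data
p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  # viridis palettes are designed to stay readable under common forms of color blindness
  scale_color_viridis_d(
    labels = c("4" = "four-wheel", "f" = "front-wheel", "r" = "rear-wheel")
  ) +
  # Replace R variable names (displ, hwy, drv) with labels a reader can understand
  labs(
    x = "Engine displacement (liters)",
    y = "Highway fuel efficiency (mpg)",
    color = "Drive train",
    title = "Cars with bigger engines tend to get fewer miles per gallon",
    caption = "Data: mpg dataset from the ggplot2 package"
  ) +
  # Swap the default gray theme for a cleaner one
  theme_minimal()

p
```

Whichever theme you pick, using the same one for every plot keeps the report visually consistent.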





EXAMPLE 4: Very extra

You can use the closeread extension to do some “scrollytelling”:

https://closeread.dev/gallery/

CAVEAT: I haven’t experimented with closeread myself, so you’d be learning and troubleshooting it on your own!





Aesthetics: do’s and don’ts

Do NOT

  • Print out code messages and warnings.
  • Print anything long (eg: entire datasets).
    • A reader shouldn’t have to scroll to find the relevant pieces of your report!
    • Many reporting outlets (eg: newspapers, magazines, blogs) have size limits.
  • Include plots that are too small or too big. Different plots may need different sizes!
  • Use gratuitous, distracting themes or change up the aesthetics throughout the report.

DO

  • At the top of your qmd, before any text, include a setup chunk similar to this one which sets up some general aesthetics:

    
    ```{r include = FALSE}
    # include = FALSE hides this chunk and its output from readers
    knitr::opts_chunk$set(
      warning = FALSE,             # Don't print out warnings
      message = FALSE,             # Don't print out messages
      collapse = TRUE,             # Show code & output in a single block
      fig.height = 3,              # By default make plots 3" tall
      fig.width = 5,               # By default make plots 5" wide
      fig.align = 'center')        # By default center plots on the page
    ```
  • When necessary, override the size of an individual plot in the header of the chunk that contains its code, replacing each ??? with a number of inches. Eg: fig.width = 4 would make the plot 4 inches wide.

    {r fig.width = ???, fig.height = ???}

  • Make sure that your aesthetic choices have a purpose and use them consistently.
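
The advice against printing anything long can often be handled in code. A minimal sketch, using the built-in cars data from Example 1: show readers a few rows and report the dataset's size, instead of printing all of it. The choice of 5 rows is arbitrary.

```r
# Show only the first few rows instead of the entire dataset
cars_preview <- head(cars, 5)
cars_preview

# Report the size of the full dataset in one number
n_cars <- nrow(cars)
n_cars
```

A sentence such as "The dataset includes 50 cars" then conveys the sample size without making readers scroll past every row.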





Report: Writing about data

More important than the aesthetics is how we write about data!

NOTE: The text and examples below are adapted from Communicating with Data by Nolan and Stoudt.





EXAMPLE 5: Fix it

Clear, concise language is important when discussing data. Consider some examples:

  • Example 1
    • Bad: “In this part of our analysis, we assume that flight delays that last shorter than 15.155598 minutes have minimal effects on passengers, and so we reduce our large dataset into a smaller subset in which all departure delays are at least fifteen minutes long.”
    • Better: “Since short departure delays have minimal impact on travelers, we analyzed only those flights where the delay was longer than 15 minutes.”
  • Example 2
    • Bad: “Thanks to my model’s output I was able to determine that there is a significant relationship between a child’s weight and their height.”
    • Better: “The model output showed a significant relationship between a child’s weight and height.”

Fix the phrases below.

  1. This research paper has the aim to investigate the physical differences between penguin species.
  2. It should be pointed out that, on average, Gentoo penguins have longer flippers than Adelie penguins.
  3. These demonstrate that they weigh more on average.





Crafting sentences

No matter your audience, the following are just some basics to follow when writing about data:

  • Remove empty phrases that contain no information (e.g. “it should be pointed out that”).
  • Use active, strong verbs instead of passive verbs (e.g. change “the research program has the aim to develop” to “the research program will pursue”).
  • Use concrete nouns (avoid imprecise pronouns such as “it” or “this”).
  • Don’t use long, convoluted sentences.
  • With data, only use precision when it matters. Otherwise, round.
  • Check for subject and verb agreement (eg: “the cats are” not “the cats is”).
  • Use consistent verb tense.
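
The rounding guideline can be applied directly in R before a number reaches your prose. A minimal sketch, reusing the over-precise 15.155598-minute cutoff from the flight delay example. The variable names are made up for illustration.

```r
# The over-precise cutoff from the "Bad" flight delay example
cutoff_minutes <- 15.155598

# Round to the precision a reader actually needs
rounded_cutoff <- round(cutoff_minutes)     # nearest whole minute
one_decimal <- round(cutoff_minutes, 1)     # one decimal place, if it matters

# Build the sentence from the rounded value
sentence <- paste0(
  "We analyzed only flights delayed longer than ", rounded_cutoff, " minutes."
)
sentence
```

Rounding in code, instead of retyping numbers by hand, keeps the text in sync with the analysis if the data change.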





20.2 Exercises

Work on Project Milestone 4.





20.3 Wrap-up

  • Project Milestone 4 is due next Tuesday by 11:59pm.
  • Remember that attendance is critical to your collaborative efforts! If you miss class, you must discuss this with your group and make up for your missed contribution to the group work on your own time.