Some of you may be surprised to learn that fill and color are different, and that you could use one or both in a given plot.
Color refers to the outline of the object (bar, pie chart wedge, etc.), while fill refers to the inside of the object. For scatterplots, the default point shape doesn't have a fill, so you'd just use color to change the appearance of those points.
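To make the distinction concrete, here is a minimal sketch using the mpg dataset that ships with ggplot2 (it stands in for the reading data, and the specific colors are arbitrary choices for illustration). The outline and the interior of each bar are set independently.

```r
library(ggplot2)

# mpg is a built-in example dataset, used here only for illustration.
# color controls the bar outline; fill controls the bar interior.
outline_and_fill <- ggplot(mpg, aes(class)) +
  geom_bar(color = "black", fill = "steelblue")

outline_and_fill
```

Dropping the fill argument leaves the default gray interior with a black outline; dropping color leaves blue bars with no visible outline.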
Let's recreate the chart of pages read over 2019, but this time, I'll just use fiction books and separate them as either fantasy or other fiction; this divides that dataset pretty evenly in half. Here's how I'd generate the pages read over time separately by those two genre categories.
library(tidyverse)
reads2019 <- read_csv("~/Downloads/Blogging A to Z/SaraReads2019_allchanges.csv", col_names = TRUE)
fantasy <- reads2019 %>%
  filter(Fiction == 1) %>%
  mutate(date_read = as.Date(date_read, format = '%m/%d/%Y'),
         Fantasy = factor(Fantasy, levels = c(0,1), labels = c("Other Fiction", "Fantasy"))) %>%
  group_by(Fantasy) %>%
  mutate(GenreRead = order_by(date_read, cumsum(Pages))) %>%
  ungroup()
library(scales)
myplot <- fantasy %>%
  ggplot(aes(date_read, GenreRead, color = Fantasy)) +
  geom_point() +
  xlab("Date") +
  ylab("Pages") +
  scale_x_date(date_labels = "%b", date_breaks = "1 month") +
  scale_y_continuous(labels = comma, breaks = seq(0,30000,5000)) +
  labs(color = "Genre of Fiction")
library(ggthemes)
myplot + scale_color_fivethirtyeight() + theme_fivethirtyeight()
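The running total per genre is the piece of that pipeline that is easiest to get wrong, so here is a small sketch of it on made-up data. The toy tibble and its column names are hypothetical; only the group_by() plus order_by(date, cumsum(...)) pattern comes from the code above.

```r
library(dplyr)

# Hypothetical toy data, deliberately out of date order.
toy <- tibble(
  genre     = c("Fantasy", "Other", "Fantasy", "Other"),
  date_read = as.Date(c("2019-02-01", "2019-01-15", "2019-01-10", "2019-03-01")),
  pages     = c(300, 200, 100, 400)
)

# Within each genre, accumulate pages in date order,
# while leaving the rows in their original positions.
running <- toy %>%
  group_by(genre) %>%
  mutate(total = order_by(date_read, cumsum(pages))) %>%
  ungroup()

running
```

Because order_by() sorts only for the purposes of the calculation, the earliest Fantasy book (January 10) gets a total of 100 even though it appears third in the data.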
geom_point offers many point shapes. Shapes 21-25 have both an outline and an interior, so you can set color and fill separately. For all other shapes, only color applies.
library(ggpubr)
ggpubr::show_point_shapes()
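As a quick illustration of one of the fillable shapes, the sketch below uses shape 21 (a circle with a border) on the built-in mtcars dataset. The dataset, sizes, and outline color are assumptions made for the example, not part of the reading data.

```r
library(ggplot2)

# Shape 21 has an outline (color) and an interior (fill).
# Here fill is mapped to a variable and the outline is fixed to black.
filled_points <- ggplot(mtcars, aes(wt, mpg, fill = factor(cyl))) +
  geom_point(shape = 21, color = "black", size = 3, stroke = 1) +
  labs(fill = "Cylinders")

filled_points
```

With a shape outside 21-25, the fill mapping would be ignored and you'd map color instead.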
reads2019 %>%
  mutate(Gender = factor(Gender, levels = c(0,1), labels = c("Male", "Female")),
         Fiction = factor(Fiction, levels = c(0,1), labels = c("Non-Fiction", "Fiction"), ordered = TRUE)) %>%
  group_by(Gender, Fiction) %>%
  summarise(Books = n()) %>%
  ggplot(aes(Fiction, Books, fill = reorder(Gender, desc(Gender)))) +
  geom_col() +
  scale_fill_economist() +
  xlab("Genre") +
  labs(fill = "Author Gender")
reads2019 %>%
  mutate(Gender = factor(Gender, levels = c(0,1), labels = c("Male", "Female")),
         Fiction = factor(Fiction, levels = c(0,1), labels = c("Non-Fiction", "Fiction"), ordered = TRUE)) %>%
  group_by(Gender, Fiction) %>%
  summarise(Books = n()) %>%
  ggplot(aes(Fiction, Books, fill = reorder(Gender, desc(Gender)))) +
  geom_col(position = "dodge") +
  scale_fill_economist() +
  xlab("Genre") +
  labs(fill = "Author Gender")
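The only difference between the two bar charts above is the position argument: the first stacks the fill groups on top of each other (the default), and the second places them side by side. Here is a minimal sketch of the same contrast on the built-in mpg dataset. It uses geom_bar(), which counts rows itself, rather than geom_col() on pre-summarised counts, so treat it as an analogue rather than the same code.

```r
library(ggplot2)

# Shared base: one bar per vehicle class, filled by drive type.
base <- ggplot(mpg, aes(class, fill = drv))

stacked <- base + geom_bar()                    # default: position = "stack"
dodged  <- base + geom_bar(position = "dodge")  # groups drawn side by side

stacked
dodged
```

Stacked bars make the overall total per category easy to read, while dodged bars make it easier to compare the groups within a category.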
genres <- reads2019 %>%
  group_by(Fiction, Childrens, Fantasy, SciFi, Mystery) %>%
  summarise(Books = n())
genres <- genres %>%
  bind_cols(Genre = c("Non-Fiction", "General Fiction", "Mystery", "Science Fiction", "Fantasy", "Fantasy Sci-Fi", "Children's Fiction", "Children's Fantasy"))
genres %>%
  filter(Genre != "Non-Fiction") %>%
  ggplot(aes(reorder(Genre, -Books), Books, fill = Genre)) +
  geom_col() +
  xlab("Genre") +
  scale_x_discrete(labels=function(x){sub("\\s", "\n", x)}) +
  scale_fill_economist() +
  theme(legend.position = "none")
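The labels function passed to scale_x_discrete() in that last chart is worth a closer look: sub() replaces only the first whitespace character with a newline, which wraps two-word genre names onto two lines. The sketch below runs the same expression outside of ggplot; the name wrap_first_space is just a label added for this illustration.

```r
# Same expression as the anonymous labels function, given a name for clarity.
wrap_first_space <- function(x) sub("\\s", "\n", x)

labels <- c("General Fiction", "Fantasy Sci-Fi", "Children's Fantasy", "Mystery")
wrapped <- wrap_first_space(labels)

# Print each label with a separator so the inserted line breaks are visible.
cat(wrapped, sep = "\n---\n")
```

Single-word labels such as "Mystery" pass through unchanged, and a label with more than one space would only break at the first one (gsub() would break at all of them).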
I hope you enjoyed this series! There's so much more you can do with tidyverse than what I covered this month. Hopefully this has given you enough to get started and sparked your interest to learn more. Once again, I highly recommend checking out R for Data Science.