In our previous post, we cleaned and merged data collected from multiple sources. Now that we have a properly structured data table, we can proceed to the visualization stage.
By changing the metrics used in the post titled “R with Google Analytics Reporting API Access”, you can generate different tables and compare them with the data presented here, either in a combined table or across separate tables.
R: Data Visualization
With base R functions such as plot, barplot, hist, pie, and dotchart, we can quickly create the kind of statistical graphic each function is named for. The most basic usage looks like this. For additional examples, please refer to the links under the Further Reading section at the end of the article.
plot(x, y, col = "<color>", axes = TRUE)
title(main = "<title>", col.main = "<color>")
box()
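To make the template above concrete, here is a minimal sketch that runs on its own. The `views` data frame and its values are made up for illustration and are not part of the dataset used in this series.

```r
# Hypothetical sample data: seven days of page views (values are made up)
views <- data.frame(
  day = 1:7,
  pageviews = c(120, 135, 128, 160, 172, 150, 181)
)

# Optional: draw to a null device so the sketch also runs without a display.
# Remove this line and the dev.off() call to see the plots interactively.
pdf(NULL)

# Line-and-point plot drawn without the default axes, which are added by hand
plot(views$day, views$pageviews, type = "b", col = "steelblue", axes = FALSE,
     xlab = "Day", ylab = "Page views")
axis(1, at = views$day)  # x axis with one tick per day
axis(2)                  # y axis with default ticks
title(main = "Daily page views", col.main = "gray30")
box()                    # frame around the plot region

# Bar chart of the same values
barplot(views$pageviews, names.arg = views$day, col = "steelblue",
        main = "Daily page views")

invisible(dev.off())
```

Setting axes = FALSE is what makes the separate axis() and box() calls necessary; with axes = TRUE, plot() draws both for you.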
As plots grow more complex, these simple techniques become harder to manage and customize. For this reason, ggplot2 has emerged as one of the most widely used packages [1]. Depending on your requirements, other options such as Plotly [2], tufte [3], and lattice [4] are also worth considering.
I plan to write a more detailed article on ggplot2. For now, I will keep this post focused and concise.
R: ggplot2 Package
At this point, our data includes page URLs, daily view counts, session durations, bounce rates, page types, the categories each page belongs to, and publication dates. Leaving the page URL aside, we can group the date information into weekly, monthly, and yearly intervals. Based on these variables, we can then build the following visualizations as needed.
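As a rough sketch of the weekly, monthly, and yearly grouping, the snippet below derives the buckets with lubridate and sums page views per month with dplyr. The `gaSample` data frame, its values, and the choice of Monday as the start of the week are assumptions made for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the merged table: one row per day (values are made up)
gaSample <- data.frame(
  date = seq(ymd("2020-01-01"), ymd("2020-03-31"), by = "day"),
  pageviews = rep(c(100, 120, 90, 110, 130, 80, 70), length.out = 91)
)

# Add calendar buckets derived from the date column
gaSample <- gaSample %>%
  mutate(
    week  = floor_date(date, unit = "week", week_start = 1),  # Monday of that week
    month = floor_date(date, unit = "month"),                 # first day of the month
    year  = year(date)
  )

# Total page views per month
monthlyViews <- gaSample %>%
  group_by(month) %>%
  summarise(pageviews = sum(pageviews), .groups = "drop")

monthlyViews
```

Swapping `month` for `week` or `year` in group_by() gives the other two aggregation levels.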
First, let’s examine how page view counts change over time.
ggplot(data = <data>, mapping = aes(x = <x>, y = <y>)) + geom_line()
We can create a simple line plot with the code line above. For a more concise and practical format, we can update our code as follows.
ggplot(data, aes(x, y)) + geom_line()
The part of the code that determines the type of plot is geom_line(). By swapping this function for another geom, we can easily switch between plot types. We can also add several layers to the same plot with the + operator. In that case, every layer uses the data and mapping defined in ggplot() unless it supplies its own.
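The sketch below illustrates the idea with a small made-up data frame (`views` is a hypothetical name): the data and mapping are declared once, and different geoms are attached to the same base object.

```r
library(ggplot2)

# Hypothetical sample data (values are made up)
views <- data.frame(
  date = as.Date("2020-02-01") + 0:13,
  pageviews = c(120, 135, 128, 160, 172, 150, 181,
                140, 155, 149, 190, 201, 170, 210)
)

# The data and mapping are declared once in ggplot() ...
base <- ggplot(views, aes(x = date, y = pageviews))

# ... and each geom decides how they are drawn
linePlot  <- base + geom_line()
barPlot   <- base + geom_col()
comboPlot <- base + geom_line() + geom_point(colour = "red")  # two layers, one mapping

comboPlot
```

Because geom_point() does not define its own aes(), it inherits x and y from the base object; only the fixed colour is specific to that layer.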
ggplot(data = gaData, mapping = aes(x = date)) +
geom_line(aes(y = pageviews), col = "red") +
geom_line(aes(y = sessions), col = "blue")
The outputs of the code lines mentioned above, from left to right, will appear as follows.
In the next step, we can group the data, facet the plots horizontally or vertically, add trend lines, and make other modifications so that the plots can be interpreted in line with our objectives.
For example, let’s compare page view counts obtained from organic and paid traffic.
library(dplyr)      # filter(), select(), %>%
library(lubridate)  # ymd()
library(ggplot2)

# Organic (red) and paid (blue) page views during February 2020
gaPageData.bsc %>%
  filter(date >= ymd("2020-02-01"), date < ymd("2020-03-01")) %>%
  select(date, organicPageviews, paidPageviews, organicbounceRate, paidbounceRate) %>%
  ggplot(aes(date)) +
  geom_line(aes(y = organicPageviews), color = "red") +
  geom_line(aes(y = paidPageviews), color = "blue") +
  ggtitle("Page Views by Traffic Source") +
  labs(x = "February 2020",
       y = "Page Views",
       subtitle = "Organic vs Paid Traffic")
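The faceting and trend-line options mentioned earlier can be sketched as follows. This is an illustration under assumptions, not the code behind the plots in this post: the `traffic` data frame is hypothetical, its values are generated, and it is in long format (one row per day and traffic source) rather than the wide format used above.

```r
library(ggplot2)

# Hypothetical long-format sample: one row per day and traffic source
days <- as.Date("2020-02-01") + 0:28
n <- length(days)
traffic <- data.frame(
  date = rep(days, times = 2),
  source = rep(c("organic", "paid"), each = n),
  pageviews = c(round(200 + 3 * seq_len(n) + 15 * sin(seq_len(n))),
                round(120 + 1 * seq_len(n) + 10 * cos(seq_len(n))))
)

p <- ggplot(traffic, aes(x = date, y = pageviews, colour = source)) +
  geom_line() +
  # Linear trend line for each traffic source
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed") +
  # One panel per source, stacked vertically; use cols = vars(source) for side by side
  facet_grid(rows = vars(source), scales = "free_y") +
  labs(x = "February 2020", y = "Page Views", colour = "Traffic Source")

p
```

With the source stored as a column, the colour mapping also produces a legend automatically, which the two separate geom_line() calls with fixed colours do not.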
R: ggplot2 Theme Customizations
Finally, I’d like to mention the theme() function. The theme() layer lets us customize the text and graphical elements of a plot (plot, panel, grid, and so on), which makes it easy to change the plot’s visual appearance.
You can see the corresponding ggplot code below. First, I defined the color variables; I recommend the hihayk/scale tool for building color palettes [5]. Next, I filtered the gaData dataset and started building the ggplot layers [6], storing the result in the gg object so that it can be passed along. In the second step, I customized the theme. I generally prefer to work from the most structural element (plot) down to the most specific one (legend), so the plot-level customizations appear in the first few lines. In the final step, I used labs() to set the plot’s titles and labels.
tcolor <- c("#474e4b",
"#7a8785",
"#acbec1",
"#c5d1c3",
"#607494",
"#f3f0f3",
"#ffffff") # https://hihayk.github.io/scale/
gg <- gaData %>%
filter(date >= ymd("2020-02-01"), date < ymd("2020-03-01")) %>%
ggplot(aes(date, pageviews, col = factor(source))) +
geom_point() +
stat_smooth(method = "lm", se = FALSE)
# Reassign gg so that the labels added in the final step are drawn on the themed plot
gg <- gg +
theme(plot.margin = unit(c(1, 1, 1, 1), "cm"),
plot.background = element_rect(fill = tcolor[7]),
panel.background = element_rect(fill = tcolor[6]),
panel.grid.major = element_line(colour = tcolor[4], linetype = "dotted", linewidth = 0.4), # use size = 0.4 on ggplot2 versions before 3.4.0
panel.grid.minor = element_line(colour = tcolor[4]),
plot.title = element_text(color = tcolor[1], vjust = -7),
plot.subtitle = element_text(color = tcolor[2], vjust = -10),
plot.caption = element_text(color = tcolor[4], vjust = -5),
axis.title.x = element_text(color = tcolor[3], vjust = -3),
axis.title.y = element_text(color = tcolor[3], vjust = 5),
axis.text.x = element_text(color = tcolor[5], angle = 90, vjust = 5),
axis.text.y = element_text(color = tcolor[5]),
legend.title = element_text(size = 10, color = tcolor[2]),
legend.background = element_blank(),
legend.key = element_blank(),
legend.position = "top", legend.box = "horizontal", legend.justification = c(1, 0))
gg + labs(color = 'Traffic Source',
x = 'February 2020',
y = 'Page View Count',
title = 'Monthly Page View Change by Page',
subtitle = '/landing-page-uri',
caption = 'Data source: Google Analytics')
You can store a theme definition in a variable and reuse it across plots without redefining it each time. Alternatively, you can wrap the theme elements in a function of your own and apply the theme to a plot by calling that function.
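Both approaches can be sketched as follows. The names `blogTheme` and `theme_blog` are hypothetical, the hex colors are taken from the palette defined earlier, and building on theme_minimal() is only one possible starting point.

```r
library(ggplot2)

# Option 1: store theme settings in a variable and add it to any plot
blogTheme <- theme(
  plot.background  = element_rect(fill = "#ffffff"),
  panel.background = element_rect(fill = "#f3f0f3"),
  panel.grid.major = element_line(colour = "#c5d1c3", linetype = "dotted"),
  legend.position  = "top"
)

# Option 2: wrap a complete theme in a function, in the style of theme_minimal()
theme_blog <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title      = element_text(colour = "#474e4b"),
      plot.subtitle   = element_text(colour = "#7a8785"),
      legend.position = "top"
    )
}

# Hypothetical sample data to demonstrate both options (values are made up)
views <- data.frame(day = 1:7, pageviews = c(120, 135, 128, 160, 172, 150, 181))
p <- ggplot(views, aes(day, pageviews)) + geom_line()

p + blogTheme
p + theme_blog(base_size = 12)
```

The variable is the quicker option for a single script; the function form is easier to share and lets you expose arguments such as base_size.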
That’s all for now. In the next post, I’ll cover how to perform directory operations, and how to save data and images in R. Following the basic operations, I’ll go through all these concepts again with a real-world case study. To stay updated with future posts, please use the newsletter sign-up form located at the bottom of the page.
Further Reading
- Frank McCown, Producing Simple Graphs with R
- The R Graph Gallery
- How to make any plot in ggplot2?
- The Complete ggplot2 Tutorial – Part 2 | How To Customize ggplot2 (Full R code)
- Combine Multiple GGPlots in One Graph