library(tidyverse)
library(highcharter)
library(xts)
Data visualization is a powerful tool that enables us to communicate complex information effectively and intuitively. In today’s data-driven world, where vast amounts of information are generated daily, the ability to create meaningful visual representations of data is essential for analysis, exploration, and decision-making. Among the plethora of tools available for data visualization, Highcharter stands out as a versatile and user-friendly option for creating interactive visualizations within the R programming environment.
In this blog post, we’ll explore how Highcharter can be leveraged to create interactive and engaging visualizations in R.
Install and load Packages
Before moving any further with this post, you will need to install and load three libraries: tidyverse, highcharter, and xts.
Tidyverse - a collection of R packages designed for data science and statistical analysis. It provides a coherent and consistent framework for working with data by promoting a tidy data structure and emphasizing a grammar of data manipulation. The tidyverse includes packages such as dplyr for data manipulation, ggplot2 for data visualization, tidyr for data tidying, readr for data import, and several others.
Highcharter - The highcharter package provides an interface to the Highcharts JavaScript library, allowing R users to create a wide variety of charts, including line charts, bar charts, scatter plots, heatmaps, and more, with interactive features such as zooming, tooltips, and drill-down capabilities.
Xts - eXtensible Time Series is an R package designed for handling time series data. It provides an extensible framework for creating, manipulating, and analyzing time series objects in R.
Install CRAN version:
install.packages("tidyverse")
install.packages("highcharter")
install.packages("xts")
Load packages:
library(tidyverse)
library(highcharter)
library(xts)
Datasets
The datasets below, which ship with the ggplot2 and highcharter packages, will be used to create bar, pie, scatter, and line charts using highcharter.
mpg is a ggplot2 dataset that contains a subset of the fuel economy data that the EPA makes available on https://fueleconomy.gov/. It contains only models which had a new release every year between 1999 and 2008 - this was used as a proxy for the popularity of the car.
Format - A data frame with 234 rows and 11 variables:
manufacturer - manufacturer name
model - model name
displ - engine displacement, in litres
year - year of manufacture
cyl - number of cylinders
trans - type of transmission
drv - the type of drive train, where f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
cty - city miles per gallon
hwy - highway miles per gallon
fl - fuel type
class - “type” of car
View mpg dataset
mpg
# A tibble: 234 × 11
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
# ℹ 224 more rows
globaltemp is a highcharter dataset that contains global temperature data by date, sourced from the Climate Lab Book.
Format - A data frame with 1992 observations and 4 variables.
date - date
lower - minimum temperature
median - median temperature
upper - maximum temperature
View globaltemp dataset
globaltemp
# A tibble: 1,992 × 4
   date       median  lower  upper
   <date>      <dbl>  <dbl>  <dbl>
 1 1850-01-01 -0.702 -1.10  -0.299
 2 1850-02-01 -0.284 -0.675  0.114
 3 1850-03-01 -0.732 -1.08  -0.383
 4 1850-04-01 -0.57  -0.903 -0.237
 5 1850-05-01 -0.325 -0.662  0.006
 6 1850-06-01 -0.213 -0.515  0.084
 7 1850-07-01 -0.128 -0.458  0.199
 8 1850-08-01 -0.233 -0.596  0.132
 9 1850-09-01 -0.444 -0.818 -0.071
10 1850-10-01 -0.452 -0.794 -0.105
# ℹ 1,982 more rows
vaccines is a highcharter dataset that contains the number of people infected with measles, measured over roughly 70 years and across all 50 states. From the WSJ analysis: http://graphics.wsj.com/infectious-diseases-and-vaccines/
Format - A data frame with 3,876 observations and 3 variables.
year - year
state - name of the state
count - number of cases per 100,000 people. If the value is NA the count was 0
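Because NA stands for a count of zero here, it usually makes sense to replace missing values before charting. The snippet below is a minimal sketch on a hypothetical toy table (toy_cases is made up for illustration and is not the vaccines data); it uses dplyr::coalesce(), which is one of several ways to do the replacement.

```r
library(dplyr)

# Hypothetical toy table shaped like vaccines (values are illustrative only)
toy_cases <- tibble(
  year  = c(1928L, 1928L, 1929L),
  state = c("Alabama", "Alaska", "Alaska"),
  count = c(335, NA, 12.5)
)

# coalesce() keeps the first non-missing value, so every NA becomes 0
toy_clean <- toy_cases |>
  mutate(count = coalesce(count, 0))

toy_clean
```

An equivalent ifelse(is.na(count), 0, count) call gives the same result; coalesce() is just a little shorter.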
View vaccines dataset
vaccines
# A tibble: 3,876 × 3
year state count
<int> <chr> <dbl>
1 1928 Alabama 335.
2 1928 Alaska NA
3 1928 Arizona 201.
4 1928 Arkansas 482.
5 1928 California 69.2
6 1928 Colorado 207.
7 1928 Connecticut 635.
8 1928 Delaware 256.
9 1928 District Of Columbia 536.
10 1928 Florida 120.
# ℹ 3,866 more rows
Highcharter Functions
hchart():
This function is used to create a highchart object directly from a data frame or other R objects.
It simplifies the process of creating charts by automatically inferring the chart type and mapping data variables to visual properties.
hcaes():
This function specifies the aesthetics mappings for the chart.
It maps data variables to visual properties of the chart, such as x-axis, y-axis, color, size, etc.
hc_xAxis() and hc_yAxis():
These functions configure the x-axis and y-axis of the chart, respectively.
They allow customization of axis titles, labels, tick marks, and other properties.
hc_title():
This function sets the title of the chart.
It allows customization of the main title displayed above the chart.
hc_exporting():
This function enables exporting functionality for the chart.
It lets users download the chart in formats such as PNG, JPEG, PDF, SVG, CSV, and XLS.
hc_add_theme():
This function applies a theme to the chart.
It allows customization of chart appearance, such as colors, fonts, and backgrounds.
hc_legend():
This function configures the appearance and position of the chart legend.
It allows customization of legend title, labels, alignment, and other properties.
hc_colors():
This function sets a custom color palette for the chart.
It allows specifying a vector of colors to be used for different data series, points, or other visual elements.
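To see how these functions fit together, here is a minimal sketch that chains several of them on a small, hypothetical data frame (fruit_sales and its values are invented for illustration). Each hc_*() call takes the chart object and returns a modified copy, which is why the calls can be piped one after another.

```r
library(highcharter)

# Hypothetical toy data, not one of the datasets used in this post
fruit_sales <- data.frame(
  fruit = c("Apples", "Bananas", "Cherries"),
  units = c(12, 7, 3)
)

fruit_chart <- hchart(fruit_sales, "column", hcaes(x = fruit, y = units)) |>
  hc_title(text = "Units Sold by Fruit") |>        # main title
  hc_xAxis(title = list(text = "Fruit")) |>        # x-axis title
  hc_yAxis(title = list(text = "Units sold")) |>   # y-axis title
  hc_colors("#5c6f7e") |>                          # one-colour palette
  hc_legend(enabled = FALSE) |>                    # hide the legend
  hc_exporting(enabled = TRUE)                     # show the export menu
```

Printing fruit_chart in RStudio or in a rendered document displays the interactive chart.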
Highcharter Parameters
type:
- This parameter specifies the type of chart to be created, such as “line”, “bar”, “scatter”, etc.
color:
- This parameter sets the color of data series, points, or other visual elements in the chart.
dataLabels:
This parameter controls the display of data labels on the chart.
It allows customization of the format, position, and appearance of data labels.
name:
- This parameter sets the name or label of a data series in the chart legend.
enabled:
- This parameter specifies whether a particular feature, such as data labels or exporting functionality, is enabled or disabled.
format:
- This parameter specifies the format of data labels or other text elements in the chart.
showInLegend:
This parameter specifies whether a data series or point should be displayed in the chart legend.
It can be set to TRUE or FALSE to control visibility in the legend.
backgroundColor:
This parameter sets the background color of the chart or specific chart elements.
It accepts color values in various formats, such as hexadecimal codes or named colors.
text:
This parameter sets the text content for various chart elements, such as titles, labels, or tooltips.
It allows customization of text appearance, formatting, and positioning.
group:
This parameter is used in conjunction with hcaes() to group data points based on a categorical variable, which is useful for creating multiple series or subsets within the chart.
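As a rough illustration of how these parameters are passed, the sketch below uses a hypothetical data frame (toy_scores is invented for this example). The group mapping goes inside hcaes(), while series-level parameters such as dataLabels and showInLegend are given as extra arguments to hchart(); nested options are written as R lists.

```r
library(highcharter)

# Hypothetical toy data: weekly scores for two made-up teams
toy_scores <- data.frame(
  week  = c(1, 2, 3, 1, 2, 3),
  score = c(4, 6, 5, 3, 7, 8),
  team  = rep(c("Red", "Blue"), each = 3)
)

toy_chart <- hchart(toy_scores, "line",
                    hcaes(x = week, y = score, group = team),   # one series per team
                    dataLabels = list(enabled = TRUE, format = "{y}"),
                    showInLegend = TRUE) |>
  hc_title(text = "Weekly Score by Team")

# Number of series created by the group mapping
length(toy_chart$x$hc_opts$series)
```

With two distinct values in team, the chart should contain two series, one per team.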
Bar Charts
Bar charts are used to compare categorical data or to track changes over time.
# Horizontal bar chart: number of cars in each class
mpg |>
  group_by(class) |>
  summarise(number_of_cars = n()) |>
  arrange(desc(number_of_cars)) |>
  hchart("bar", hcaes(x = class, y = number_of_cars),
         color = "#5c6f7e",
         dataLabels = list(enabled = TRUE, format = "{y}"),
         name = "Number of cars") |>
  hc_xAxis(title = list(text = "Car type")) |>
  hc_yAxis(title = list(text = "Number of cars"),
           labels = list(format = "{value}")) |>
  hc_title(text = "Distribution of Car Types") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Vertical version of the same chart: "column" instead of "bar"
mpg |>
  group_by(class) |>
  summarise(number_of_cars = n()) |>
  arrange(desc(number_of_cars)) |>
  hchart("column", hcaes(x = class, y = number_of_cars),
         color = "#5c6f7e",
         dataLabels = list(enabled = TRUE, format = "{y}"),
         name = "Number of cars") |>
  hc_xAxis(title = list(text = "Car type")) |>
  hc_yAxis(title = list(text = "Number of cars"),
           labels = list(format = "{value}")) |>
  hc_title(text = "Distribution of Car Types") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Grouped column chart: one series per drive train within each class
mpg |>
  group_by(class, drv) |>
  summarise(number_of_cars = n(), .groups = "drop") |>
  arrange(desc(number_of_cars)) |>
  hchart("column", hcaes(x = class, y = number_of_cars, group = drv),
         dataLabels = list(enabled = TRUE, format = "{y}")) |>
  hc_xAxis(title = list(text = "Car type")) |>
  hc_yAxis(title = list(text = "Number of Cars"),
           labels = list(format = "{value}")) |>
  hc_title(text = "Distribution of Car Types by Drive Train") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Stacked column chart: stacking = "normal" stacks the drive-train series
mpg |>
  group_by(class, drv) |>
  summarise(number_of_cars = n(), .groups = "drop") |>
  arrange(desc(number_of_cars)) |>
  hchart("column", hcaes(x = class, y = number_of_cars, group = drv),
         dataLabels = list(enabled = TRUE, format = "{y}"),
         stacking = "normal") |>
  hc_colors(c("#005383", "#5c6f7e", "#dc3545")) |>
  hc_xAxis(title = list(text = "Type of car")) |>
  hc_yAxis(title = list(text = "Number of cars"),
           labels = list(format = "{value}")) |>
  hc_title(text = "Distribution of Car Types by Drive Train") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
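The stacked chart above uses stacking = "normal", which stacks raw counts. Highcharts also accepts stacking = "percent", which rescales every stack to 100%. The following is a small sketch on hypothetical toy counts (toy_counts is invented here, not derived from mpg) to show the difference in isolation.

```r
library(highcharter)

# Hypothetical toy counts, not taken from the datasets in this post
toy_counts <- data.frame(
  category = rep(c("A", "B"), each = 2),
  part     = rep(c("x", "y"), times = 2),
  n        = c(3, 1, 2, 2)
)

percent_chart <- hchart(toy_counts, "column",
                        hcaes(x = category, y = n, group = part),
                        stacking = "percent") |>           # each column sums to 100%
  hc_yAxis(labels = list(format = "{value}%")) |>
  hc_title(text = "Share of Each Part Within a Category")
```

Percent stacking is convenient when the categories differ a lot in size and only the composition matters.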
Pie Charts
Pie charts are effective for showing the composition or distribution of categorical data, such as market shares or proportions of a whole.
# Pie chart: number of cars per drive train
mpg |>
  group_by(drv) |>
  summarise(number_of_cars = n()) |>
  arrange(desc(number_of_cars)) |>
  hchart("pie", hcaes(x = drv, y = number_of_cars),
         dataLabels = list(format = "<b>{point.name}</b>:<br>{point.number_of_cars}"),
         name = "Number of cars",
         showInLegend = TRUE) |>
  hc_colors(c("#dc3545", "#5c6f7e", "orange")) |>
  hc_title(text = "Drive Train Distribution") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = FALSE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Pie chart: percentage of cars per drive train
mpg |>
  group_by(drv) |>
  summarise(number_of_cars = n()) |>
  mutate(percentage_of_cars = round(number_of_cars / sum(number_of_cars) * 100, 1)) |>
  arrange(desc(percentage_of_cars)) |>
  hchart("pie", hcaes(x = drv, y = percentage_of_cars),
         dataLabels = list(format = "<b>{point.name}</b>:<br>{point.percentage_of_cars:.1f}%"),
         name = "Percentage of cars",
         showInLegend = TRUE) |>
  hc_colors(c("#dc3545", "#5c6f7e", "orange")) |>
  hc_title(text = "Drive Train Distribution") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = FALSE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
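The percentage step in the chart above can be checked on its own before any plotting. This is a minimal sketch of the same arithmetic; it uses dplyr::count(), which is shorthand for the group_by() plus summarise(n()) pattern, and the name drv_share is only an illustrative choice.

```r
library(dplyr)
library(ggplot2)  # provides the mpg dataset

drv_share <- mpg |>
  count(drv, name = "number_of_cars") |>
  # share of each drive train = its count / total count * 100, to 1 decimal
  mutate(percentage_of_cars = round(number_of_cars / sum(number_of_cars) * 100, 1))

drv_share

# Sanity checks: counts add up to the number of rows, shares to about 100
sum(drv_share$number_of_cars) == nrow(mpg)
sum(drv_share$percentage_of_cars)
```

Because each share is rounded separately, the percentages may not always add up to exactly 100.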
# Column chart alternative to the pie: percentage of cars per drive train
mpg |>
  group_by(drv) |>
  summarise(number_of_cars = n()) |>
  mutate(percentage_of_cars = round(number_of_cars / sum(number_of_cars) * 100, 1)) |>
  arrange(desc(percentage_of_cars)) |>
  hchart("column", hcaes(x = drv, y = percentage_of_cars),
         color = "#5c6f7e",
         dataLabels = list(enabled = TRUE, format = "{y}%"),
         name = "Percentage of cars") |>
  hc_xAxis(title = list(text = "Type of drive train")) |>
  hc_yAxis(title = list(text = "Percentage of cars"),
           labels = list(format = "{value}%")) |>
  hc_title(text = "Drive Train Distribution") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Same column chart, with one colour per bar mapped through hcaes()
# (the colour vector has one entry per row of the summarised data)
mpg |>
  group_by(drv) |>
  summarise(number_of_cars = n()) |>
  mutate(percentage_of_cars = round(number_of_cars / sum(number_of_cars) * 100, 1)) |>
  arrange(desc(percentage_of_cars)) |>
  hchart("column", hcaes(x = drv, y = percentage_of_cars,
                         color = c("#dc3545", "#5c6f7e", "#005383")),
         dataLabels = list(enabled = TRUE, format = "{y}%"),
         name = "Percentage of cars") |>
  hc_xAxis(title = list(text = "Type of drive train")) |>
  hc_yAxis(title = list(text = "Percentage of cars"),
           labels = list(format = "{value}%")) |>
  hc_title(text = "Drive Train Distribution") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
Scatter Charts
Scatter plots are used to visualize relationships between two continuous variables, such as correlation or clustering patterns.
# Scatter plot: engine displacement against city mileage
mpg |>
  hchart("scatter", hcaes(x = displ, y = cty),
         color = "orange") |>
  hc_xAxis(title = list(text = "Engine displacement, in litres")) |>
  hc_yAxis(title = list(text = "City miles per gallon")) |>
  hc_title(text = "Engine Displacement (in litres) vs City Miles Per Gallon") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Scatter plot with one series per drive train (default colours)
mpg |>
  hchart("scatter", hcaes(x = displ, y = cty, group = drv)) |>
  hc_xAxis(title = list(text = "Engine displacement, in litres")) |>
  hc_yAxis(title = list(text = "City miles per gallon")) |>
  hc_title(text = "Engine Displacement (in litres) vs City Miles Per Gallon According to the Type of Drive Train") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Same grouped scatter plot with a custom colour palette
mpg |>
  hchart("scatter", hcaes(x = displ, y = cty, group = drv)) |>
  hc_xAxis(title = list(text = "Engine displacement, in litres")) |>
  hc_yAxis(title = list(text = "City miles per gallon")) |>
  hc_title(text = "Engine Displacement (in litres) vs City Miles Per Gallon According to the Type of Drive Train") |>
  hc_legend(title = list(text = "Type of Drive Train")) |>
  hc_exporting(enabled = TRUE) |>
  hc_colors(c("#dc3545", "#5c6f7e", "orange")) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
Line Charts
Line charts are commonly used to display trends over time or to track changes in data continuously.
# Line chart: yearly average of the lower (minimum) temperature
# year() comes from lubridate, which library(tidyverse) attaches as of tidyverse 2.0.0
globaltemp |>
  mutate(year = year(date)) |>
  group_by(year) |>
  summarise(average_minimum_temperature = round(mean(lower), 2)) |>
  hchart("line", hcaes(x = year, y = average_minimum_temperature),
         color = "#005383",
         name = "Average Minimum Temperature") |>
  hc_xAxis(title = list(text = "Year")) |>
  hc_yAxis(title = list(text = "Average Minimum Temperature")) |>
  hc_title(text = "Average Global Minimum Temperature over the Years") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
# Line chart with two series: measles cases in Florida and California
vaccines |>
  filter(state %in% c("Florida", "California")) |>
  mutate(count = ifelse(is.na(count), 0, count)) |>  # NA means a count of 0
  hchart("line", hcaes(x = year, y = count, group = state)) |>
  hc_xAxis(title = list(text = "Year")) |>
  hc_yAxis(title = list(text = "Number of cases per 100k people")) |>
  hc_title(text = "Measles Infected Cases per 100k People in Florida & California") |>
  hc_colors(c("#dc3545", "#5c6f7e")) |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
Create an extensible time-series object
# Build an xts object: the values are ordered by their dates
globaltemp_xts <- xts(x = globaltemp$lower,
                      order.by = globaltemp$date)

# type = "stock" creates a Highstock chart for time series
highchart(type = "stock") |>
  hc_add_series(globaltemp_xts,
                type = "line",
                color = "#005383",
                name = "Minimum Temperature") |>
  hc_xAxis(title = list(text = "Date")) |>
  hc_yAxis(title = list(text = "Global Minimum Temperature"),
           opposite = FALSE) |>
  hc_title(text = "Global Minimum Temperature over the Years") |>
  hc_exporting(enabled = TRUE) |>
  hc_add_theme(hc_theme(chart = list(backgroundColor = "white")))
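For readers new to xts, the following sketch shows what such an object looks like on a tiny, hypothetical series (toy_dates and toy_xts are invented for illustration). An xts object pairs a vector or matrix of values with a time index, and it can be subset with date strings.

```r
library(xts)

# Hypothetical toy series: three monthly values
toy_dates <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-01"))
toy_xts   <- xts(x = c(0.5, 0.7, 0.6), order.by = toy_dates)

index(toy_xts)      # the time index (dates)
coredata(toy_xts)   # the underlying values, stored as a matrix
toy_xts["2020-02"]  # subset by a date string: February 2020 only
```

The same pattern applies to globaltemp: the values go in x and the dates in order.by.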
Your friendly neighborhood data scientist