Colour Gradient Scale with scale_fill_gradientn in R

scale_fill_gradientn is a function in ggplot2, a package for creating and customizing graphics in R. This article explores its arguments and walks through examples of its use.

Key points

  • The scale_fill_gradientn is a powerful function of ggplot2 for data visualization customization.
  • It is a function in the ggplot2 package that allows you to create an n-color gradient scale for fill aesthetics by specifying a vector of colors and adjusting other parameters.
  • It is useful for customizing the color scheme of your plots by highlighting patterns, trends, or differences in your data and making your plots more appealing and informative.
  • It can be applied to different types of plots, such as heat maps, choropleth maps, and contour plots, that use fill aesthetics to represent a numeric variable.
  • It has several parameters that you can adjust to customize your plot further: values (the position of each color along the gradient), space (the color space used for interpolation), na.value (the color for missing values), and guide (the type of legend).
  • It can be combined with palettes from other packages, such as RColorBrewer, and with arguments such as na.value and limits to deal with missing values and outliers.
How to Create a Colour Gradient Scale with scale_fill_gradientn in R

Functions and Description

Function | Description
scale_fill_gradientn | Creates an n-colour gradient scale for fill aesthetics, by specifying a vector of colors and adjusting other parameters
scale_fill_gradient | Creates a two-colour gradient scale for fill aesthetics, by specifying two colors and adjusting other parameters
scale_fill_gradient2 | Creates a diverging (low, mid, high) gradient scale for fill aesthetics, by specifying three colors and a midpoint and adjusting other parameters
scale_fill_distiller | Creates a gradient scale for fill aesthetics, by specifying a ColorBrewer palette name and adjusting other parameters
scale_fill_viridis | Creates a gradient scale for fill aesthetics, by specifying a viridis color palette name and adjusting other parameters (provided by the viridis package; ggplot2 itself offers scale_fill_viridis_c)


Hi, I’m Zubair Goraya, a Ph.D. scholar, a certified data analyst, and a freelancer with 5 years of experience. I’m passionate about unraveling insights through data and sharing them with others. In this article, I’ll show you how to create an n-colour gradient scale with the scale_fill_gradientn function in R, a powerful tool for data visualization.

What is scale_fill_gradientn in R?

scale_fill_gradientn is a function in the ggplot2 package that lets you customize the color scheme of your plots by specifying a vector of colors, which are interpolated to create a continuous gradient. It can help you highlight patterns, trends, or differences in your data and make your plots more appealing and informative.

Why use scale_fill_gradientn for ggplot2 customization?

scale_fill_gradientn gives you more flexibility and control over the color scheme of your plots than other functions that create gradient scales: scale_fill_gradient is limited to two colors, and scale_fill_distiller is limited to the ColorBrewer palettes.

With scale_fill_gradientn, you can specify any number of colors and adjust other parameters, such as the values, the space, the na.value, and the guide, to customize your plots further.

For example, you can use scale_fill_gradientn to create a rainbow gradient, a skewed gradient, a hue-based gradient, or a custom gradient with your own colors. You can also use it in different kinds of plots, such as heat maps, choropleth maps, and contour plots.
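
As a minimal sketch of the idea, the snippet below fills a raster with a three-color gradient. It assumes the faithfuld dataset that ships with ggplot2, and the color choices are purely illustrative.

```r
library(ggplot2)

# faithfuld ships with ggplot2: a grid of waiting/eruption times
# with a precomputed 2D density estimate
p <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster() +
  # the three colors are interpolated into one continuous gradient
  scale_fill_gradientn(colours = c("navy", "white", "firebrick"))
p
```

Adding or removing entries in the colours vector changes how many anchor colors the gradient passes through.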

You’ll need to have R and RStudio installed on your computer to follow along with the tutorial. You’ll also need to have some basic knowledge of R and ggplot2. If you’re new to R or ggplot2, you can check out some of my previous articles on Data Analysis, a data analysis website that provides tutorials related to RStudio. You can also find more resources and references at the end of this article.

How to use scale_fill_gradientn

Before we can use scale_fill_gradientn, we need to install and load the ggplot2 package, which is the main package for creating and customizing graphics in R. We’ll also use the dplyr package, which is a handy package for data manipulation in R. To install and load these packages:

# Install ggplot2 and dplyr packages if not already installed
if (!require(ggplot2)) install.packages("ggplot2")
if (!require(dplyr)) install.packages("dplyr")
# Load ggplot2 and dplyr packages
library(ggplot2)
library(dplyr)

Import and Prepare the data for Data Visualization

For this tutorial, we’ll use a sample dataset called diamonds, which is included in the ggplot2 package. The dataset contains information about 53,940 diamonds, such as their carat, cut, color, clarity, price, and dimensions. To load and inspect this dataset, we can run the following code:

# Load diamonds dataset
data(diamonds)
# Inspect diamonds dataset
head(diamonds)

For this tutorial, we’ll focus on the relationship between the carat and the price of the diamonds and how it varies by the cut quality. 

To simplify our analysis, we’ll only use a subset of the diamonds dataset that contains these three variables. To create this subset, we can use the select() function from the dplyr package, which allows us to pick the columns we want to keep. We can also use the filter() function to remove any rows with missing or zero values in any of the columns. We can run the following code to create and inspect our subset:

# Create a subset of diamonds dataset with carat, cut, and price columns
diamonds_subset <- diamonds %>%
  select(carat, cut, price) %>%
  filter(!is.na(carat), !is.na(cut), !is.na(price), carat > 0, price > 0)
# Inspect diamonds_subset
head(diamonds_subset)

The output shows the first six rows of the diamonds_subset, which has three columns.
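
To see what the filter actually did, it can help to compare row counts before and after. The sketch below rebuilds the subset so that it runs on its own; rows_before and rows_after are names introduced here for illustration. If the three columns contain no missing or non-positive values, the difference is zero.

```r
library(ggplot2)
library(dplyr)

# Rebuild the subset so this snippet is self-contained
diamonds_subset <- diamonds %>%
  select(carat, cut, price) %>%
  filter(!is.na(carat), !is.na(cut), !is.na(price), carat > 0, price > 0)

# Compare row counts before and after filtering
rows_before <- nrow(diamonds)
rows_after <- nrow(diamonds_subset)
rows_before - rows_after  # number of rows removed by the filter

# Summarise the three columns to check their ranges
summary(diamonds_subset)
```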

Create a Basic Plot with the fill aesthetic

The fill aesthetic maps a variable to the color of the area of a geometric object, such as a bar, a tile, or a polygon. To create a basic plot with the fill aesthetic, we can use the ggplot() function to create a ggplot2 object and then add a geom_*() layer to specify the type of geometric object we want to plot. For example, we can use geom_bar() to create a bar plot, geom_tile() to create a tile plot, or geom_polygon() to create a polygon plot.

In this tutorial, we’ll use geom_point() to create a scatter plot of the carat and the price of the diamonds and map the cut variable to the fill aesthetic. It will create a plot with points that have different colors based on the cut quality. 

# Create a basic plot with a fill aesthetic
ggplot(diamonds_subset, aes(x = carat, y = price, fill = cut)) +
  geom_point(shape = 21, alpha = 0.5)

In the code above, we used the following arguments:

  • data: the name of the data frame we want to plot
  • aes: the aesthetic mapping, where we specify which variables we want to map to which aesthetics
  • x: the x-axis variable
  • y: the y-axis variable
  • fill: the fill variable
  • shape: the shape of the points; we used 21, which is a circle with a separate outline and fill, so the fill aesthetic applies to it
  • alpha: the transparency of the points, we used 0.5, which is 50% transparent

As you can see, the plot shows the relationship between the carat and the price of the diamonds and how it varies by cut quality. However, the default color scheme leaves room for improvement. Because cut is a factor, ggplot2 uses a discrete scale that assigns a separate color to each level, and we have not chosen any of those colors ourselves. Moreover, the plot is crowded and hard to read, especially for the lower carat and price values.


Plot with scale_fill_gradientn

This is where scale_fill_gradientn comes in handy. It allows us to customize the color scheme of our plot by specifying a vector of colors that will be interpolated to create a continuous gradient.

It can help us create a more appealing and informative plot that shows the variation of the cut quality along a spectrum of colors. To use scale_fill_gradientn, we need to add it to our Plot and provide a vector of colors as an argument. For example, we can use the following code to create a plot with a blue-to-red gradient:

# Create a plot with scale_fill_gradientn
ggplot(diamonds_subset, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(colors = c("blue", "red"))

In the code above, we used the following argument:

  • colors: a vector of colors that will be interpolated to create a gradient; we used c("blue", "red"), which means blue for the lowest value of cut and red for the highest value of cut.

Because scale_fill_gradientn is a continuous scale, we also wrapped cut in as.numeric(), which converts its five ordered levels (Fair to Ideal) to the numbers 1 to 5.

As you can see, the plot now shows a clearer color scheme that reflects the ordering of the cut quality. The points with the lowest cut quality (Fair) are blue, while those with the highest (Ideal) are red. The points with intermediate cut quality (Good, Very Good, Premium) are shades of purple. This makes it easier to see how cut quality is distributed across the carat and price of the diamonds.
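
One side effect of as.numeric(cut) is that the legend shows the numbers 1 to 5 rather than the level names. A possible way around this, sketched below, is to pass breaks and labels to the scale. The variable cut_levels is a name introduced here, and the snippet uses the full diamonds data so that it runs on its own.

```r
library(ggplot2)

# Level names of cut, in their stored order
cut_levels <- levels(diamonds$cut)

p <- ggplot(diamonds, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(
    colours = c("blue", "red"),
    breaks = seq_along(cut_levels),  # one legend tick per numeric code
    labels = cut_levels,             # show the level name instead of the number
    name = "cut"
  )
p
```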

Parameters of scale_fill_gradientn

scale_fill_gradientn has several parameters that we can adjust to customize our plot further. Here are some of the most important ones:

Values

A vector of numeric values between 0 and 1 that indicates the position of each color along the gradient. The values are equally spaced by default, but we can change them to create a non-linear gradient. For example, we can use the following code to create a plot with a skewed gradient that emphasizes the lower values of cut:

# Create a plot with a skewed gradient
ggplot(diamonds_subset, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(colors = c("blue", "purple", "red"), values = c(0, 0.2, 1))

In the code above, we used the next argument:

  • values: a vector of numeric values between 0 and 1, one per color, that indicates the position of each color along the gradient; we used c(0, 0.2, 1), which places blue at 0, purple at 0.2, and red at 1. The values vector should have the same length as the colors vector.

As you can see, the plot now shows a skewed gradient that emphasizes the lower values of cut. The blue-to-purple transition is compressed into the lowest fifth of the range, so the points with the lowest cut quality (Fair) are blue, those with Good quality are already close to purple, and the remaining levels shift gradually from purple to red (Ideal). This makes differences at the low end of the scale easier to see.
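
Positions in values are expressed on a 0 to 1 scale, which can be awkward when you think in data units. One option, sketched here, is to convert data values with scales::rescale(). The anchor codes c(1, 2, 5) are a hypothetical choice, and the conversion only lines up with the plot because they span the full range of as.numeric(cut), which is 1 to 5.

```r
library(ggplot2)
library(scales)

# Numeric codes of cut at which each color should sit (hypothetical choice)
anchor_codes <- c(1, 2, 5)
# rescale() maps them linearly onto the 0-1 range that `values` expects
anchor_positions <- rescale(anchor_codes)

p <- ggplot(diamonds, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(
    colours = c("blue", "purple", "red"),
    values = anchor_positions
  )
p
```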

Space

By default, the space is "Lab", a color space designed to be approximately perceptually uniform. The ggplot2 documentation states that space must be "Lab" and that other values are deprecated, so options such as "rgb" or "hsv" do not change how the colors are interpolated. To get a gradient that sweeps through hues, supply colors that span the hue range instead. For example, we can use the following code to create a plot with a hue-based gradient:

# create a plot with hue-based gradient
ggplot(diamonds_subset, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(colors = rainbow(5), space = "Lab")

In the code above, we used the next arguments:

  • colors: we used rainbow(5), which returns five colors spaced evenly around the hue wheel
  • space: the color space in which the colors are interpolated; we kept the default, "Lab"

As you can see, the plot now shows a gradient that moves through several hues instead of blending two colors. Because cut has five levels and the gradient has five evenly spaced colors, each cut level lands on one of the rainbow colors. This makes it easier to see the contrast between the levels of cut quality.
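
To inspect which colors an interpolation produces without drawing a plot, the scales package can build a palette function directly. The sketch below uses scales::gradient_n_pal(), which to my understanding is the same helper ggplot2 relies on for this scale; treat that link as an assumption. The names pal, positions, and interpolated are introduced here.

```r
library(scales)

# Palette function that interpolates between the given colors
pal <- gradient_n_pal(c("blue", "red"))

# Ask for the colors at five evenly spaced positions between 0 and 1
positions <- seq(0, 1, by = 0.25)
interpolated <- pal(positions)
interpolated  # a character vector of hex color codes
```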

na.value (Missing Values)

By default, the na.value is "grey50", which is a medium grey color. However, we can change the na.value to any color we want, such as "white", "black", or a hexadecimal code. For example, we can use the following code to create a plot with a white color for missing values:

# Create a plot with white color for missing values
ggplot(diamonds_subset, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(colors = c("blue", "red"), na.value = "white")

Because diamonds_subset has no missing values in cut, this particular plot looks the same as the earlier blue-to-red plot. When the fill variable does contain NA, or values that fall outside the scale limits, those points are drawn in white, which makes it easier to see the presence and distribution of missing values in the data.
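
To see na.value in action, the fill variable needs some missing values. The sketch below creates them artificially by blanking out the cut code for stones above 2 carats. This is purely for illustration and not part of the original analysis; diamonds_na and cut_code are names introduced here.

```r
library(ggplot2)

# Copy the three columns and add a numeric version of cut
diamonds_na <- diamonds[, c("carat", "cut", "price")]
diamonds_na$cut_code <- as.numeric(diamonds_na$cut)

# Artificially set the code to NA for large stones (illustration only)
diamonds_na$cut_code[diamonds_na$carat > 2] <- NA

p <- ggplot(diamonds_na, aes(x = carat, y = price, fill = cut_code)) +
  geom_point(shape = 21, alpha = 0.5) +
  # points with a missing fill value are drawn in white
  scale_fill_gradientn(colours = c("blue", "red"), na.value = "white")
p
```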

Guide

By default, the guide is “colourbar”, which is a continuous color bar that shows the range of values and colors. However, we can change the guide to other options, such as “legend”, which is a discrete legend that shows the levels and colors. For example, we can use the following code to create a plot with a discrete legend:

# Create a plot with discrete legend
ggplot(diamonds_subset, aes(x = carat, y = price, fill = as.numeric(cut))) +
  geom_point(shape = 21, alpha = 0.5) +
  scale_fill_gradientn(colors = c("blue", "red"), guide = "legend")

In the code above, we used the next argument:

  • guide: the type of legend to use; we used "legend", which draws a discrete legend with one key per break.

As you can see, the Plot now shows a discrete legend that shows the levels and colors of the cut variable. This makes it easier to see the correspondence between the colors and the values.

These are some of the main parameters of scale_fill_gradientn that we can adjust to further customize our Plot. However, there are more parameters that we can explore, such as the breaks, the labels, the limits, and the aesthetics. For more details, you can check the documentation of scale_fill_gradientn.
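
As one example of these extra parameters, the sketch below sets limits and uses the oob argument with scales::squish so that values beyond the limits are clamped to the nearest end color rather than treated as missing. It uses the faithfuld dataset from ggplot2, and the upper limit of 0.02 is a hypothetical cut-off chosen for illustration.

```r
library(ggplot2)
library(scales)

p <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster() +
  scale_fill_gradientn(
    colours = c("blue", "yellow", "red"),
    limits = c(0, 0.02),  # hypothetical range covered by the gradient
    oob = squish          # clamp out-of-range values to the nearest limit
  )
p
```

Without oob = squish, values outside the limits would be drawn with na.value.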

Examples of scale_fill_gradientn

Now that we know how to use scale_fill_gradientn and its parameters, let’s see some examples of how we can apply it to different types of plots. We’ll use scale_fill_gradientn to create a heat map, a choropleth map, and a contour plot.

How to create a heat map with scale_fill_gradientn

A heat map is a type of plot that shows the distribution of a numeric variable across two categorical variables, using colors to represent the values. To create a heat map with scale_fill_gradientn, we can use the geom_tile() function to create a tile plot and map the numeric variable to the fill aesthetic. For example, we can use the following code to create a heat map of the average price of the diamonds by cut and color:

# Create a heat map of the average price by cut and color
diamonds %>%
  group_by(cut, color) %>%
  summarise(mean_price = mean(price), .groups = "drop") %>%
  ggplot(aes(x = cut, y = color, fill = mean_price)) +
  geom_tile() +
  scale_fill_gradientn(colors = c("blue", "red"))

In the code above, we used the following arguments:

  • x: the x-axis variable; we used cut, which is a categorical variable
  • y: the y-axis variable; we used color, which is another categorical variable
  • fill: the fill variable; we used mean_price, a numeric variable computed with group_by() and summarise() that holds the average price of the diamonds for each combination of cut and color
  • colors: a vector of colors that will be interpolated to create a gradient. We used c("blue", "red"), which means blue for the lowest value of mean_price and red for the highest value of mean_price

How to customize a heat map with scale_fill_gradientn

We can also adjust the parameters of scale_fill_gradientn to customize our heat map, such as the values, the na.value, and the guide. For example, we can use the following code to create a heat map with a skewed gradient, a white color for missing values, and a discrete legend:

# Create a heat map with customized scale_fill_gradientn
diamonds %>%
  group_by(cut, color) %>%
  summarise(mean_price = mean(price), .groups = "drop") %>%
  ggplot(aes(x = cut, y = color, fill = mean_price)) +
  geom_tile() +
  scale_fill_gradientn(colors = c("blue", "purple", "red"), values = c(0, 0.2, 1), na.value = "white", guide = "legend")

As you can see, the plot now shows a different color scheme and legend that reflect our customization.
The plot shows a skewed gradient that emphasizes the lower values of the average price, a white color for any missing values, and a discrete legend that shows one key per break.

A heat map is a useful plot for showing the distribution of a numeric variable across two categorical variables, using colors to represent the values. We can use scale_fill_gradientn to customize the color scheme of our heat map by specifying a vector of colors and adjusting other parameters.
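
Since the colors argument accepts any vector of colors, a ready-made palette can be passed in as well. The sketch below uses RColorBrewer::brewer.pal() with the "YlOrRd" palette as one possible choice; price_by_cut_color is a name introduced here.

```r
library(ggplot2)
library(dplyr)
library(RColorBrewer)

# Average price for each combination of cut and color
price_by_cut_color <- diamonds %>%
  group_by(cut, color) %>%
  summarise(mean_price = mean(price), .groups = "drop")

p <- ggplot(price_by_cut_color, aes(x = cut, y = color, fill = mean_price)) +
  geom_tile() +
  # nine ColorBrewer colors, interpolated into a continuous gradient
  scale_fill_gradientn(colours = brewer.pal(9, "YlOrRd"))
p
```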

How to create a choropleth map with scale_fill_gradientn

A choropleth map is a type of plot that shows the distribution of a numeric variable across geographic regions, using colors to represent the values. To create a choropleth map with scale_fill_gradientn, we can use the geom_sf() function to create a spatial plot, and map the numeric variable to the fill aesthetic. For example, we can use the following code to create a choropleth map of the population density of the US states:

# Load sf and tigris packages (install tigris if not already installed)
library(sf)
if (!require(tigris)) install.packages("tigris")
library(tigris)
# Download and prepare the US states data
us_states <- states(cb = TRUE) %>%
  select(NAME, ALAND) %>%
  rename(state = NAME, area = ALAND) %>%
  mutate(area = area / 1000000) # convert area from square meters to square kilometers
# Read and prepare the US population data (a CSV file saved in the working directory)
population <- read.csv("nst-est2020-alldata.csv") %>%
  select(NAME, POPESTIMATE2020) %>%
  rename(state = NAME, population = POPESTIMATE2020)
# Join the US states and population data
us_data <- left_join(us_states, population, by = "state") %>%
  mutate(density = population / area) # population density in people per square kilometer
# Create a choropleth map of the population density of the US states
ggplot(us_data, aes(fill = density)) +
  geom_sf() +
  scale_fill_gradientn(colors = c("blue", "red"))

In the code above, we used the following arguments:

  • data: the name of the data frame we want to plot. We used us_data, which is a spatial data frame that contains the state name, area, population, and density.
  • fill: the fill variable; we used density, which is a numeric variable that represents the population density of the US states
  • colors: a vector of colors that will be interpolated to create a gradient; we used c("blue", "red"), which means blue for the lowest value of density and red for the highest value of density

As you can see, the plot shows the distribution of the population density of the US states, using a blue-to-red gradient to represent the values. Each state is filled with a single color, so the map shows differences between states rather than variation within a state. Densely populated areas appear closer to red and sparsely populated ones closer to blue; because a few areas tend to be far denser than the rest, many states end up near the blue end of a linear scale.
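
If you want to try a choropleth without downloading anything, the sf package ships with a shapefile of North Carolina counties. The sketch below uses it as a stand-in for the states data; birth_density is a name introduced here, computed from the BIR74 and AREA columns in whatever units the file stores.

```r
library(ggplot2)
library(sf)

# North Carolina counties, bundled with the sf package
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Births in 1974 divided by the AREA column (units as stored in the file)
nc$birth_density <- nc$BIR74 / nc$AREA

p <- ggplot(nc) +
  geom_sf(aes(fill = birth_density)) +
  scale_fill_gradientn(colours = c("blue", "yellow", "red"))
p
```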

Customize our choropleth map by using the scale_fill_gradientn function

We can also adjust the parameters of scale_fill_gradientn to customize our choropleth map, such as the trans, the na.value, and the guide. For example, we can use the following code to create a choropleth map with a log-scaled gradient, a white color for missing values, and a continuous color bar:

# Create a choropleth map with customized scale_fill_gradientn
ggplot(us_data, aes(fill = density)) +
  geom_sf() +
  scale_fill_gradientn(colors = c("blue", "red"), trans = "log10", na.value = "white", guide = "colourbar")

In the code above, we used the following arguments:

  • trans: the transformation to apply to the values; we used "log10", which means log base 10.
  • na.value: the color to use for missing values; we used "white".
  • guide: the type of legend to use; we used "colourbar", which means a continuous color bar.

As we can see, the plot now shows a different color scheme and legend that reflect our customization. The plot shows a log-scaled gradient that reduces the skewness of the density values, a white color for regions with no density value, and a continuous color bar that shows the range of values and colors of density.

A choropleth map is a useful plot for showing the distribution of a numeric variable across geographic regions, using colors to represent the values. We can use scale_fill_gradientn to customize the color scheme of our choropleth map, by specifying a vector of colors and adjusting other parameters.
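
The same log transformation can be tried on data that needs no download. The sketch below bins the diamonds data into a grid and colors each cell by its count on a log10 scale. It uses the trans argument as in the text; in recent ggplot2 releases the same option is also offered under the name transform, so check which one your installed version prefers.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(x = carat, y = price)) +
  # count the diamonds falling into each cell of a 50 x 50 grid
  geom_bin_2d(bins = 50) +
  # color the counts on a log10 scale so sparse cells remain visible
  scale_fill_gradientn(colours = c("blue", "yellow", "red"), trans = "log10")
p
```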

How to create a contour plot with scale_fill_gradientn

A contour plot is a type of plot that shows the distribution of a numeric variable across two numeric variables, using contours to represent the values. To create a contour plot with scale_fill_gradientn, we can use the geom_density_2d_filled() function to create a filled density plot, and map the numeric variable to the fill aesthetic. For example, we can use the following code to create a contour plot of the density of the diamonds by carat and price:

# Create a contour plot of the density by carat and price
ggplot(diamonds, aes(x = carat, y = price, fill = as.numeric(after_stat(level)))) +
  geom_density_2d_filled() +
  scale_fill_gradientn(colors = c("blue", "red"))

In the code above, we used the following arguments:

  • x: for the x-axis variable, we used carat, which is a numeric variable.
  • y: for the y-axis variable, we used price, which is another numeric variable.
  • fill: for the fill variable, we used as.numeric(after_stat(level)), which converts the density band computed by the layer into a number.
  • colors: a vector of colors that will be interpolated to create a gradient. We used c("blue", "red"), which means blue for the lowest value of the level and red for the highest value of the level.

As you can see, the plot shows the distribution of the density of the diamonds by carat and price, using a blue-to-red gradient to represent the values. The plot shows that the density tends to be higher for the lower carat and price values and lower for the higher carat and price values. Each contour band is filled with a single color, so the change in density appears as a step from one band to the next.
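
If you would rather see the density as a smooth surface than as stepped bands, one alternative is to draw the estimated density as a raster. The sketch below does this with stat_density_2d() on the faithful dataset from base R, chosen here only because it is small.

```r
library(ggplot2)

p <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  # contour = FALSE returns the density on a grid instead of contour bands
  stat_density_2d(
    aes(fill = after_stat(density)),
    geom = "raster",
    contour = FALSE
  ) +
  scale_fill_gradientn(colours = c("blue", "yellow", "red"))
p
```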

Conclusion

In this article, we learned how to create an n-colour gradient scale with scale_fill_gradientn function in R, a powerful tool for data visualization. We saw what scale_fill_gradientn does, why it is useful, how to use it and its parameters, and how to apply it to different types of plots. We also saw some examples of creating a heat map, a choropleth map, and a contour plot with scale_fill_gradientn.

scale_fill_gradientn is a great function for customizing the color scheme of our plots by specifying a vector of colors that will be interpolated to create a continuous gradient. It can help us highlight patterns, trends, or differences in our data and make our plots more appealing and informative. We can also adjust other parameters of scale_fill_gradientn, such as the values, the space, the na.value, and the guide, to further customize our plots.

However, scale_fill_gradientn is not the only function that can create a gradient scale in R. There are other functions, such as scale_fill_gradient(), scale_fill_gradient2(), scale_fill_distiller(), and scale_fill_viridis() from the viridis package, that can create different types of gradient scales. For more details, you can check the documentation of these functions.

I hope you enjoyed this article and learned something new and useful. If you have any questions, comments, or feedback, please leave them below. And if you liked this article, please share it with others and help me grow. 

Frequently Asked Questions (FAQs)

What is scale_fill_gradientn in r ggplot2?

`scale_fill_gradientn` is a function that creates a color scale with n colors interpolated from a given vector of colors. It can be used to fill the plot areas with a gradient of colors based on a continuous variable.

What is the scale_fill_gradientn color palette?

`scale_fill_gradientn` takes an argument `colours` that specifies the vector of colours for the gradient. The colors can be given as names, hexadecimal codes, or RColorBrewer palettes. For example, `colours = c("red", "green", "blue")` will create a gradient from red to green to blue.

What is scale_fill_gradientn breaks?

`scale_fill_gradientn` also takes an argument `breaks`, which controls where tick marks and labels appear on the legend; it does not change where the colours sit. By default, ggplot2 picks the breaks automatically from the range of the data. To control where each colour falls along the gradient, use `values`, a numeric vector between 0 and 1 with one entry per colour. For example, with `colours = c("red", "green", "blue")`, `values = c(0, 0.5, 1)` places green at the midpoint of the range, while `breaks = c(0, 0.5, 1)` only labels the legend at 0, 0.5, and 1.

What is the scale_fill_gradient midpoint?

`scale_fill_gradient` is a simpler version of `scale_fill_gradientn` that creates a two-colour gradient from a `low` colour to a `high` colour; it has no `midpoint` argument. The `midpoint` argument belongs to `scale_fill_gradient2`, which creates a diverging gradient through `low`, `mid`, and `high` colours. By default, the midpoint is 0, but it can be set to any numeric value, such as the mean of the data. For example, `midpoint = 0` shows the `mid` colour at 0, with values below shading towards `low` and values above towards `high`.
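
A short sketch of `midpoint` in use: the density values of the faithfuld dataset are centred on their mean so that there are values on both sides of zero, and `scale_fill_gradient2` then places white at 0. The names `centred` and `centred_density` are introduced here for illustration.

```r
library(ggplot2)

# Centre the density on its mean so values fall on both sides of zero
centred <- faithfuld
centred$centred_density <- centred$density - mean(centred$density)

p <- ggplot(centred, aes(x = waiting, y = eruptions, fill = centred_density)) +
  geom_raster() +
  # white marks the midpoint; blue is below it and red is above it
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0)
p
```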

What are scale_fill_gradient limits?

`scale_fill_gradient` also takes an argument `limits` that specifies the range of the data that the color scale covers. By default, the limits are the minimum and maximum of the data, but they can be manually set to any numeric vector of length two. For example, `limits = c(-1, 1)` will make the color scale cover only the values between -1 and 1, and any values outside this range will be shown as missing values.

What is scale_fill_continuous?

`scale_fill_continuous` is a generic function that creates a continuous color scale based on the `type` argument. By default, the type is `"gradient"`, which means that `scale_fill_continuous` is equivalent to `scale_fill_gradient`. Alternatively, the type can be `"viridis"` or a function that returns a continuous color scale. For example, `type = "viridis"` will use the Viridis color palette, which is perceptually uniform and suitable for colorblind viewers.

What is the r color gradient by value?

To create a color gradient by value in R, one can use the `colorRamp` or `colorRampPalette` functions, which return functions that interpolate colors between a given vector of colours. For example, `pal <- colorRamp(c("blue", "white", "red"))` will create a function `pal` that takes a numeric vector between 0 and 1 and returns the corresponding colors from blue to white to red as a matrix of RGB values. Then, `rgb(pal(0.5), maxColorValue = 255)` will return the color at the midpoint, which is `"#FFFFFF"` (white).
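
The two functions can be compared side by side in a few lines of base R. The variable names below are introduced for illustration.

```r
# colorRamp() returns a function that maps numbers in [0, 1] to RGB rows
ramp <- colorRamp(c("blue", "white", "red"))
mid_colour <- rgb(ramp(0.5), maxColorValue = 255)
mid_colour

# colorRampPalette() returns a function that maps a count n to n hex colours
palette_fn <- colorRampPalette(c("blue", "white", "red"))
five_colours <- palette_fn(5)
five_colours
```

A vector such as `five_colours` can be passed directly to the `colours` argument of `scale_fill_gradientn`.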

What is scale_fill_distiller?

`scale_fill_distiller` is a function that creates a continuous color scale from a discrete palette of colors, such as the ones from ColorBrewer. It takes an argument `palette` that specifies the name or number of the palette to use and an argument `type` that specifies whether the palette is sequential (`"seq"`), diverging (`"div"`), or qualitative (`"qual"`). For example, `palette = "Spectral"` and `type = "div"` will use the diverging Spectral palette from ColorBrewer. `scale_fill_distiller` smoothly interpolates the colors from the palette to create a continuous gradient.



 



About the author

Zubair Goraya
Ph.D. Scholar | Certified Data Analyst | Blogger | Completed 5000+ data projects | Passionate about unravelling insights through data.
