Sales Funnel visualization with R

Sales (purchasing or conversion) Funnel is a consumer-focused marketing model which illustrates the theoretical customer journey towards the purchase of a product or service. Classically, the Sales funnel includes at least four steps:

  • Awareness – the customer becomes aware of the existence of a product or service,
  • Interest – actively expressing an interest in a product group,
  • Desire – aspiring to a particular brand or product,
  • Action – taking the next step towards purchasing the chosen product.

There can be more steps and this depends on the business model and level of detail you want to/can represent. Obviously, there is quite a lot of creative work in order to establish the model and define the steps of the customer journey in practice. Once you managed this it is always interesting to see the result in visual form. Usually, the Sales Funnel looks like a tornado and it narrows from the top to the bottom. Therefore, the Sales Funnel allows you to find where there is a bottle funnel neck or what is the step where you lose customers.

We will study a very simple example of creating a Sales Funnel by focusing on visualization in this post.

If we speak about e-commerce, we can use content of the site for defining steps. Let’s assume that our main page and landing pages of ad campaigns are responsible for the Awareness step. The pages with product or service descriptions correspond to the Interest from customers. Shopping cart page confirms the Desire. Finally, the conversion page (“thank you page”) is the Action. In other words, our theoretical customer journey looks like the following:

  • customer’s seen our advertisement and clicked on it ->
  • landed on the landing page ->
  • was interested and moved to the product description ->
  • wanted the product and added it to the shopping cart ->
  • purchased.

Note: you can distribute all pages to the Sales Funnel steps with the content grouping option in Google Analytics. Further, you can find a lot of examples on how to extract data from Google Analytics directly from R via API. And if you use the Google Analytics content grouping option you can extract these groups.

Let’s say we have a dataset with the grouped pages and number of sessions (visits). Further, we need to define which pages belong to each step of the funnel:

content step number
main
awareness
150000
ad landing
awareness
80000
product 1
interest
80000
product 2
interest
40000
product 3
interest
35000
product 4
interest
25000
shopping cart
desire
130000
thank you page
action
120000

Furthermore, we can make some improvements to the bottom of the Sales Funnel. If you use LifeCycle Grids you can identify the customer’s lifecycle phase. Therefore, we can represent a breakdown of customers who made purchases into their status (e.g. new customer, engaged and loyal). If you are not familiar with the LifeCycle Grids approach, please start here.

Here is our data set with the number of customers and their statuses:

 content step number 
new
new
25000
engaged
engaged
40000
loyal
loyal
55000

By using the same column names with the content table, it is easier to combine data.

The logic behind the R code is simple: we need to combine tables, define the order of steps, calculate dummy values for centering bars, calculate a share of session proceeded to the next step and plot the result:

sales_funnel

Let’s go through the plot: there are stacked bars on the top of the chart that represent the total number of sessions (visits) at each step of Sales Funnel. 78.3% of sessions went from the Awareness step to the Interest one, 72.2% from the Interest to the Desire and 92.3% from the Desire to the Action. Even though it is not necessary, we used another logic at the bottom of the Funnel. If we can identify the customer’s type, we can split the total number of Actions (120K in our case) into different customer types (e.g. new, engaged and loyal customers). Further, we calculated the share of each customer type in the Action (20.8%, 33.3% and 45.8%). If we aren’t new in the market and have a number of loyal customers, we would take the hourglass view of the Sales Funnel.

click to expand R code

library(dplyr)
library(ggplot2)
library(reshape2)

# creating a data samples
# content
df.content <- data.frame(content = c('main', 'ad landing',
 'product 1', 'product 2', 'product 3', 'product 4',
 'shopping cart',
 'thank you page'),
 step = c('awareness', 'awareness',
 'interest', 'interest', 'interest', 'interest',
 'desire',
 'action'),
 number = c(150000, 80000,
 80000, 40000, 35000, 25000,
 130000,
 120000))
# customers
df.customers <- data.frame(content = c('new', 'engaged', 'loyal'),
 step = c('new', 'engaged', 'loyal'),
 number = c(25000, 40000, 55000))
# combining two data sets
df.all <- rbind(df.content, df.customers)

# calculating dummies, max and min values of X for plotting
df.all <- df.all %>%
 group_by(step) %>%
 mutate(totnum = sum(number)) %>%
 ungroup() %>%
 mutate(dum = (max(totnum) - totnum)/2,
 maxx = totnum + dum,
 minx = dum)

# data frame for plotting funnel lines
df.lines <- df.all %>%
 distinct(step, maxx, minx)

# data frame with dummies
df.dum <- df.all %>%
 distinct(step, dum) %>%
 mutate(content = 'dummy',
 number = dum) %>%
 select(content, step, number)

# data frame with rates
conv <- df.all$totnum[df.all$step == 'action']

df.rates <- df.all %>%
 distinct(step, totnum) %>%
 mutate(prevnum = lag(totnum),
 rate = ifelse(step == 'new' | step == 'engaged' | step == 'loyal',
 round(totnum / conv, 3),
 round(totnum / prevnum, 3))) %>%
 select(step, rate)
df.rates <- na.omit(df.rates)

# creting final data frame
df.all <- df.all %>%
 select(content, step, number)

df.all <- rbind(df.all, df.dum)

# defining order of steps
df.all$step <- factor(df.all$step, levels = c('loyal', 'engaged', 'new', 'action', 'desire', 'interest', 'awareness'))
df.all <- df.all %>%
 arrange(desc(step))
list1 <- df.all %>% distinct(content) %>%
 filter(content != 'dummy')
df.all$content <- factor(df.all$content, levels = c(as.character(list1$content), 'dummy'))

# calculating position of labels
df.all <- df.all %>%
 arrange(step, desc(content)) %>%
 group_by(step) %>%
 mutate(pos = cumsum(number) - 0.5*number) %>%
 ungroup()

# creating custom palette with 'white' color for dummies
cols <- c("#fec44f", "#fc9272", "#a1d99b", "#fee0d2",
 "#2ca25f", "#8856a7", "#43a2ca", "#fdbb84",
 "#e34a33", "#a6bddb", "#dd1c77", "#ffffff")

# plotting chart
ggplot() +
 theme_minimal() +
 coord_flip() +
 scale_fill_manual(values=cols) +
 geom_bar(data=df.all, aes(x=step, y=number, fill=content), stat="identity", width=1) +
 geom_text(data=df.all[df.all$content!='dummy', ],
 aes(x=step, y=pos, label=paste0(content, '-', number/1000, 'K')),
 size=4, color='white', fontface="bold") +
 geom_ribbon(data=df.lines, aes(x=step, ymax=max(maxx), ymin=maxx, group=1), fill='white') +
 geom_line(data=df.lines, aes(x=step, y=maxx, group=1), color='darkred', size=4) +
 geom_ribbon(data=df.lines, aes(x=step, ymax=minx, ymin=min(minx), group=1), fill='white') +
 geom_line(data=df.lines, aes(x=step, y=minx, group=1), color='darkred', size=4) +
 geom_text(data=df.rates, aes(x=step, y=(df.lines$minx[-1]), label=paste0(rate*100, '%')), hjust=1.2,
 color='darkblue', fontface="bold") +
 theme(legend.position='none', axis.ticks=element_blank(), axis.text.x=element_blank(), 
 axis.title.x=element_blank())