Using Markov Chains in R to Model Channel Attribution in Marketing | With Code and Data

Original link: http://tecdat.cn/?p=5383

Recently we were asked by a client to write a research report on Markov chains, including some graphical and statistical output.

In this article, we look at what channel attribution is and how it relates to the concept of Markov chains.

We then see how the concept works in theory and in practice (using R) through a case study of an e-commerce company.

What is channel attribution?

Google Analytics provides a standard set of rules for attribution modeling. According to Google, “An attribution model is a rule or set of rules that determines how credit for sales and conversions is assigned to touchpoints in a conversion path. For example, the last interaction model in Google Analytics assigns 100% credit to the final touchpoints (i.e., clicks) that immediately precede sales or conversions. In contrast, the first interaction model assigns 100% credit to the touchpoint that initiated the conversion path.”

We will come back to the last interaction model and the first interaction model later in this article. Before that, let’s take a small example to understand more about channel attribution. Suppose we have a conversion graph as shown below:

[Figure: conversion graph for channels C1, C2 and C3]

In the above scenario, customers can start their journey through channel ‘C1’ or channel ‘C2’, each with a probability of 50% (or 0.5). We first calculate the overall probability of conversion and then look at the impact of each channel.

P(conversion) = P(C1→C2→C3→conversion) + P(C2→C3→conversion)

= 0.5 * 0.5 * 1 * 0.6 + 0.5 * 1 * 0.6

= 0.15 + 0.3

= 0.45
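The same arithmetic can be checked in R by multiplying the edge probabilities along each path. This is only a minimal sketch: the matrix layout and the names `P` and `path_prob` are our own, and the two drop-out edges into a "Null" state are assumptions that the calculation above does not need.

```r
# Transition probabilities read off the example graph
# (rows = current state, columns = next state).
states <- c("Start", "C1", "C2", "C3", "Conversion", "Null")
P <- matrix(0, nrow = 6, ncol = 6, dimnames = list(states, states))
P["Start", "C1"] <- 0.5
P["Start", "C2"] <- 0.5
P["C1", "C2"] <- 0.5
P["C1", "Null"] <- 0.5          # assumption: the rest of C1 traffic drops out
P["C2", "C3"] <- 1
P["C3", "Conversion"] <- 0.6
P["C3", "Null"] <- 0.4          # assumption: non-converting C3 traffic drops out

# Probability of one specific path = product of its edge probabilities.
path_prob <- function(path) {
  from <- path[-length(path)]
  to <- path[-1]
  prod(P[cbind(from, to)])
}

p_conversion <- path_prob(c("Start", "C1", "C2", "C3", "Conversion")) +
  path_prob(c("Start", "C2", "C3", "Conversion"))
print(p_conversion)
```

The two terms are 0.15 and 0.3, matching the hand calculation.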

Markov chain

A Markov chain is a process that describes movement between a set of states and attaches a probability to each move from one state to another. A Markov chain is defined by three properties:

State space – the set of all possible states the process can be in

Transition operator – the probability of moving from one state to another

Current state probability distribution – the probability distribution of being in any one of the states at the beginning of the process

In our example we know the states a customer can pass through, the probability of moving from each state to the next, and the probability of starting in each state. This looks a lot like a Markov chain.
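To make the three properties concrete, here is a small sketch that writes them down for the C1/C2/C3 example as plain R objects. The variable names and the "Null" drop-out state are illustrative assumptions of ours, not something taken from the data.

```r
# 1. State space
states <- c("C1", "C2", "C3", "Conversion", "Null")

# 2. Transition operator, as a matrix (row = from, column = to)
P <- matrix(0, 5, 5, dimnames = list(states, states))
P["C1", c("C2", "Null")] <- c(0.5, 0.5)          # drop-out edge is assumed
P["C2", "C3"] <- 1
P["C3", c("Conversion", "Null")] <- c(0.6, 0.4)  # drop-out edge is assumed
P["Conversion", "Conversion"] <- 1               # absorbing state
P["Null", "Null"] <- 1                           # absorbing state

# 3. Probability distribution over states at the start of the process
init <- c(C1 = 0.5, C2 = 0.5, C3 = 0, Conversion = 0, Null = 0)

# Every row of a transition matrix must sum to 1.
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# Push the distribution forward three steps: dist <- dist %*% P each time.
dist <- init
for (step in 1:3) {
  dist <- dist %*% P
}
print(round(dist, 3))
```

After three steps every journey in this small graph has ended in either "Conversion" or "Null", so the "Conversion" entry is the overall conversion probability.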

In fact, this is an application of Markov chains. To work out the contribution of channel C1 to our customers’ journey from start to finish, we use the removal effect principle. It says that to find the contribution of each channel in the customer journey, we remove that channel and see how many conversions would still have occurred without it.

For example, let’s assume we have to calculate the contribution of channel C1. We remove channel C1 from the model and compare the conversions that occur without C1 in the picture against the total number of conversions when all channels are intact. For channel C1:

P (conversion after removing C1) = P (C2→C3→conversion)

= 0.5 * 1 * 0.6

= 0.3

30% of customer interactions can convert without the C1 channel; with C1 intact, 45% of interactions can convert. Without C1 we therefore keep 0.3 / 0.45 = 0.666 of the conversions, so the removal effect of C1, the share of conversions that is lost, is

1 - 0.3 / 0.45 = 0.333.

The removal effect of C2 and C3 is 1 (you can try to calculate it, but think about it intuitively: if we were to remove either C2 or C3, would we be able to complete any conversion?).
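The removal effects for all three channels can be computed in one go. The sketch below is a toy version under our own assumptions: the helper `conversion_prob` is hypothetical, removing a channel is modelled by sending all of its incoming traffic to an assumed "Null" state, and the removal effect is taken as the share of conversions lost.

```r
# Conversion probability of the toy graph, optionally with one channel removed.
conversion_prob <- function(removed = NULL) {
  states <- c("Start", "C1", "C2", "C3", "Conversion", "Null")
  P <- matrix(0, 6, 6, dimnames = list(states, states))
  P["Start", c("C1", "C2")] <- c(0.5, 0.5)
  P["C1", c("C2", "Null")] <- c(0.5, 0.5)          # assumed drop-out from C1
  P["C2", "C3"] <- 1
  P["C3", c("Conversion", "Null")] <- c(0.6, 0.4)  # assumed drop-out from C3
  P["Conversion", "Conversion"] <- 1
  P["Null", "Null"] <- 1

  if (!is.null(removed)) {
    P[, "Null"] <- P[, "Null"] + P[, removed]  # redirect incoming traffic
    P[, removed] <- 0
    P[removed, ] <- 0
    P[removed, "Null"] <- 1                    # keep the row summing to 1
  }

  dist <- c(1, 0, 0, 0, 0, 0)                    # everyone begins in "Start"
  for (step in seq_len(10)) dist <- dist %*% P   # enough steps to absorb
  unname(dist[1, "Conversion"])
}

base <- conversion_prob()
removal_effect <- sapply(c("C1", "C2", "C3"),
                         function(ch) 1 - conversion_prob(ch) / base)
print(round(removal_effect, 3))

# Normalising the removal effects gives each channel's share of the credit.
print(round(removal_effect / sum(removal_effect), 3))
```

With these numbers C1 ends up with roughly one seventh of the credit and C2 and C3 with three sevenths each, because every conversion has to pass through C2 and C3.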

This is a very useful application of Markov chains. In the above case, all channels (C1, C2 and C3, at different stages) are called transition states, and the probability of moving from one channel to another is called the transition probability.

A customer journey is a sequence of channels that can be viewed as a chain in a directed Markov graph, where each vertex is a state (channel/touchpoint) and each edge carries the transition probability of moving from one state to another. Since the probability of reaching a state depends only on the previous state, it can be considered a memoryless Markov chain.
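Because the chain is memoryless, its transition probabilities can be estimated simply by counting how often each state follows another. The sketch below does this for four made-up journeys; the paths, the `converted` flags and the added "Start", "Conversion" and "Null" states are all illustrative assumptions.

```r
# Hypothetical journeys in "a > b > c" form; 1 = converted, 0 = not converted.
paths <- c("C1 > C2 > C3", "C2 > C3", "C1", "C2 > C3")
converted <- c(1, 1, 0, 0)

# Expand each journey into its consecutive (from, to) state pairs.
pairs <- do.call(rbind, lapply(seq_along(paths), function(i) {
  steps <- c("Start",
             strsplit(paths[i], " > ", fixed = TRUE)[[1]],
             if (converted[i] == 1) "Conversion" else "Null")
  data.frame(from = steps[-length(steps)], to = steps[-1])
}))

# Count the transitions, then normalise each row to get probabilities.
counts <- table(pairs$from, pairs$to)
trans_prob <- prop.table(counts, margin = 1)
print(round(trans_prob, 2))
```

Each row of `trans_prob` sums to 1 and depends only on the current state, which is exactly the first-order (memoryless) assumption.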

E-commerce company case study

Let’s do a real case study and see how we implement channel attribution modeling.

An e-commerce company conducted a survey and collected data from its customers. This can be considered a representative population. In the survey, the company collects data on customers’ visits to various touchpoints that ultimately lead to purchasing the product on its website.

There are a total of 19 channels where customers can encounter products or product advertisements. Beyond these 19 channels, there are three further states:

#20 – The customer has decided which device to buy;

#21 – The customer has made the final purchase;

#22 – The customer is undecided.

The overall classification of channels is as follows:

| Category | Channel |
| --- | --- |
| Website (1,2,3) | Company’s website or competitor’s website |
| Research report (4,5,6,7,8) | Industry consulting research report |
| Online/Comments (9,10) | Natural search, forum |
| Price Comparison (11) | Aggregation channels |
| Friends (12,13) | Social network |
| Experts (14) | Online or offline experts |
| Retail stores (15,16,17) | Physical stores |
| Others (18,19) | Others, such as promotions in various places |

Now, we need to help the e-commerce company determine the right strategy for investing in marketing channels. Which channels should it focus on? Which channels should it invest in? We will use R to answer these questions in the next section.

Implementation using R

We read the data, try to implement it in R and check the results.

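> head(channel)

Output:

| R05A.01 | R05A.02 | R05A.03 | R05A.04 | ….. | R05A.18 | R05A.19 | R05A.20 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | 4 | 3 | 5 | ….. | NA | NA | NA |
| 2 | 1 | 9 | 10 | ….. | NA | NA | NA |
| 9 | 13 | 20 | 16 | ….. | NA | NA | NA |
| 8 | 15 | 20 | 21 | ….. | NA | NA | NA |
| 16 | 9 | 13 | 20 | ….. | NA | NA | NA |
| 1 | 11 | 8 | 4 | ….. | NA | NA | NA |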

We’re going to do some data processing to bring the data into a form we can use as input to our model. We then determine which customers have made the final conversion.

We create a variable ‘path’ in the specific format the model expects as input. Additionally, we use the ddply function from the “plyr” package to total the conversions recorded for each path.
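The processing code itself is not shown here, so the following is only a sketch of what such a step might look like in base R. The tiny data frame, its column names and the helper `build_path` are made up for illustration, and we assume that state 21 (final purchase) marks a conversion and that the path keeps only the touchpoints before it.

```r
# Tiny stand-in for the survey data: one row per customer, touchpoints in
# order, NA once the journey has ended. Values are illustrative only.
channel <- data.frame(
  t1 = c(1, 1, 9, 8),
  t2 = c(3, 3, 13, 15),
  t3 = c(20, 20, NA, 20),
  t4 = c(21, 21, NA, 21)
)

# Assumption: a journey is converted if state 21 appears, and the path is
# everything the customer touched before reaching 21.
build_path <- function(row) {
  steps <- row[!is.na(row)]
  hit <- match(21, steps)
  if (!is.na(hit)) steps <- steps[seq_len(hit - 1)]
  paste(steps, collapse = " > ")
}

touch <- as.matrix(channel)
channel_fin <- data.frame(
  path = apply(touch, 1, build_path),
  convert = as.integer(apply(touch, 1, function(row) any(row == 21, na.rm = TRUE)))
)

# Total conversions per distinct path (a base-R counterpart of a ddply summary).
path_counts <- aggregate(convert ~ path, data = channel_fin, FUN = sum)
names(path_counts)[2] <- "conversion"
print(path_counts)
```

The result has one row per distinct path with its summed conversions, which is the two-column shape that attribution models typically take as input.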

> channel_fin = ddply(channel_fin, ~path, summarise, conversion = sum(convert))

> head(channel_fin)

Output:

| Path | Conversion |
| --- | --- |
| 1 > 1 > 1 > 20 | 1 |
| 1 > 1 > 12 > 12 | 1 |
| 1 > 1 > 14 > 13 > 12 > 20 | 1 |
| 1 > 1 > 3 > 13 > 3 > 20 | 1 |
| 1 > 1 > 3 > 17 > 17 | 1 |
| 1 > 1 > 6 > 1 > 12 > 20 > 12 | 1 |

Now we will create a heuristic model and a Markov model, combine the two, and examine the final results.
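To show what the heuristic side computes, here is a small base-R sketch of first touch, last touch and linear touch credit on a few made-up aggregated paths. The data and the helper `credit` are hypothetical. In practice the `ChannelAttribution` package on CRAN provides `heuristic_models()` and `markov_model()` for this kind of input, although the exact calls used for the results in this article are not shown here.

```r
# Hypothetical aggregated paths in the same path / conversion layout.
Data <- data.frame(
  path = c("10 > 4 > 20", "13 > 20", "10 > 9 > 4 > 20", "2 > 16"),
  conversion = c(3, 2, 1, 4),
  stringsAsFactors = FALSE
)

steps <- strsplit(Data$path, " > ", fixed = TRUE)
channels <- sort(unique(unlist(steps)))

# Spread each path's conversions over its touchpoints using position weights.
credit <- function(weights_fun) {
  out <- setNames(numeric(length(channels)), channels)
  for (i in seq_along(steps)) {
    w <- weights_fun(length(steps[[i]]))   # credit share per position
    share <- tapply(w * Data$conversion[i], steps[[i]], sum)
    out[names(share)] <- out[names(share)] + as.vector(share)
  }
  out
}

first_touch  <- credit(function(n) c(1, rep(0, n - 1)))
last_touch   <- credit(function(n) c(rep(0, n - 1), 1))
linear_touch <- credit(function(n) rep(1 / n, n))

print(data.frame(channel_name = channels, first_touch, last_touch,
                 linear_touch, row.names = NULL))
```

Each of the three columns sums to the total number of conversions (10 in this toy data); the models differ only in how that total is split across channels. The tables that follow are the results on the full survey data, not the output of this sketch.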

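Output (heuristic models):

| Channel Name | First Touch Conversion | ….. | Linear Touch Conversion | Linear Touch Value |
| --- | --- | --- | --- | --- |
| 1 | 130 | ….. | 73.773661 | 73.773661 |
| 20 | 0 | ….. | 473.998171 | 473.998171 |
| 12 | 75 | ….. | 76.127863 | 76.127863 |
| 14 | 34 | ….. | 56.335744 | 56.335744 |
| 13 | 320 | ….. | 204.039552 | 204.039552 |
| 3 | 168 | ….. | 117.609677 | 117.609677 |
| 17 | 31 | ….. | 76.583847 | 76.583847 |
| 6 | 50 | ….. | 54.707124 | 54.707124 |
| 8 | 56 | ….. | 53.677862 | 53.677862 |
| 10 | 547 | ….. | 211.822393 | 211.822393 |
| 11 | 66 | ….. | 107.109048 | 107.109048 |
| 16 | 111 | ….. | 156.049086 | 156.049086 |
| 2 | 199 | ….. | 94.111668 | 94.111668 |
| 4 | 231 | ….. | 250.784033 | 250.784033 |
| 7 | 26 | ….. | 33.435991 | 33.435991 |
| 5 | 62 | ….. | 74.900402 | 74.900402 |
| 9 | 250 | ….. | 194.07169 | 194.07169 |
| 15 | 22 | ….. | 65.159225 | 65.159225 |
| 18 | 4 | ….. | 5.026587 | 5.026587 |
| 19 | 10 | ….. | 12.676375 | 12.676375 |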

Output:

| Channel Name | Total Conversion | Total Conversion Value |
| --- | --- | --- |
| 1 | 82.482961 | 82.482961 |
| 20 | 432.40615 | 432.40615 |
| 12 | 83.942587 | 83.942587 |
| 14 | 63.08676 | 63.08676 |
| 13 | 195.751556 | 195.751556 |
| 3 | 122.973752 | 122.973752 |
| 17 | 83.866724 | 83.866724 |
| 6 | 63.280828 | 63.280828 |
| 8 | 61.016115 | 61.016115 |
| 10 | 209.035208 | 209.035208 |
| 11 | 118.563707 | 118.563707 |
| 16 | 158.692238 | 158.692238 |
| 2 | 98.067199 | 98.067199 |
| 4 | 223.709091 | 223.709091 |
| 7 | 41.919248 | 41.919248 |
| 5 | 81.865473 | 81.865473 |
| 9 | 179.483376 | 179.483376 |
| 15 | 70.360777 | 70.360777 |
| 18 | 5.950827 | 5.950827 |
| 19 | 15.545424 | 15.545424 |

Before we go any further, let’s first understand what some of the terms we’ve seen above mean.

First touch conversion: a conversion credited to a channel because that channel was the customer’s first touchpoint. The first touchpoint gets 100% of the credit.

Last touch conversion: a conversion credited to a channel because that channel was the customer’s last touchpoint. The last touchpoint gets 100% of the credit.

Linear touch conversion: the credit for a conversion is split equally among all touchpoints in the customer’s path.

Back to the R code, let’s merge the two models and represent the output visually.

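library(ggplot2)

# Plot total conversions per channel, with one bar per attribution model.
# R1 is assumed to hold the merged heuristic and Markov results in long
# format (columns: channel_name, variable, value).
ggplot(R1, aes(channel_name, value, fill = variable)) +
  geom_bar(stat = 'identity', position = 'dodge')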
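The merge step itself is not shown, so here is one possible sketch of how a long-format data frame with the columns `channel_name`, `variable` and `value` could be built. The frames `H` and `M` are hypothetical stand-ins that hold three channels with values rounded from the result tables; the column names and the base-R reshaping are our own choices.

```r
# Stand-ins for the heuristic (H) and Markov (M) results, three channels only.
H <- data.frame(channel_name = c("4", "10", "20"),
                first_touch_conversions = c(231, 547, 0),
                linear_touch_conversions = c(250.8, 211.8, 474.0))
M <- data.frame(channel_name = c("4", "10", "20"),
                total_conversion = c(223.7, 209.0, 432.4))

# Merge the two result sets on the channel name.
R <- merge(H, M, by = "channel_name")

# Reshape to long format: one row per (channel, model) pair.
measures <- setdiff(names(R), "channel_name")
R1 <- data.frame(
  channel_name = rep(R$channel_name, times = length(measures)),
  variable = rep(measures, each = nrow(R)),
  value = unlist(R[measures], use.names = FALSE)
)
print(R1)

# Build the grouped bar chart only if ggplot2 is installed; print(p) draws it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(R1, ggplot2::aes(channel_name, value, fill = variable)) +
    ggplot2::geom_bar(stat = "identity", position = "dodge")
}
```

Long format is what lets `fill = variable` draw one bar per model side by side for each channel.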

[Figure: grouped bar chart of conversions per channel for each attribution model]

The results can be clearly seen in the image above. From a first touch conversion perspective, Channel 10, Channel 13, Channel 2, Channel 4 and Channel 9 are very important, while from a last touch perspective Channel 20 is the most important (because #20 is the state in which customers decide which device to buy). In terms of linear touch conversion, Channel 20, Channel 4 and Channel 9 are important. Channels 10, 13, 20, 4 and 9 are very important from a total conversion perspective.

Conclusion

From the chart above, we can work out which channels are important for us to focus on and which ones can be ignored. This case study gives us a good understanding of how Markov chain models apply in the field of customer analysis. The e-commerce company can now shape its marketing strategy more accurately and allocate its marketing budget using data-driven insights.
