Original link: http://tecdat.cn/?p=5383
Recently we were asked by a client to write a research report on Markov chains, including some graphical and statistical output.
In this article, we look at what channel attribution is and how it relates to the concept of Markov chains.
We will also understand how this concept works theoretically and practically (using R) through a case study of an e-commerce company.
What is channel attribution?
Google Analytics provides a standard set of rules for attribution modeling. According to Google, “An attribution model is a rule or set of rules that determines how sales and conversions are assigned to touchpoints in a conversion path. For example, the last interaction model in Google Analytics assigns 100% credit to the final touchpoints (i.e., clicks) that immediately precede sales or conversions. In contrast, the first interaction model assigns 100% credit to the touchpoint that initiated the conversion path.”
We will see the last interaction model and the first interaction model later in this article. Before that, let’s take a small example to understand more about channel attribution. Suppose we have a conversion graph with three channels, C1, C2 and C3.
In this scenario, customers can start their journey through channel ‘C1’ or channel ‘C2’, each with a probability of 50% (or 0.5). From C1 a customer moves on to C2 with probability 0.5, C2 always leads to C3, and C3 leads to a conversion with probability 0.6. We first calculate the overall probability of conversion and then look further at the impact of each channel.
P(conversion) = P(C1→C2→C3→conversion) + P(C2→C3→conversion)
= 0.5 * 0.5 * 1 * 0.6 + 0.5 * 1 * 0.6
= 0.15 + 0.3
= 0.45
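The arithmetic above can be checked with a few lines of R. This is only a sketch: the names in `p` are our own labels for the edge probabilities used in the calculation, not identifiers from the original analysis.

```r
# Edge probabilities taken from the worked example (names are illustrative).
p <- c(start_C1 = 0.5, start_C2 = 0.5, C1_C2 = 0.5, C2_C3 = 1.0, C3_conv = 0.6)

# The probability of a converting path is the product of its edge probabilities.
path_via_C1 <- p[["start_C1"]] * p[["C1_C2"]] * p[["C2_C3"]] * p[["C3_conv"]]
path_via_C2 <- p[["start_C2"]] * p[["C2_C3"]] * p[["C3_conv"]]

# The two paths are mutually exclusive, so their probabilities add up.
p_conversion <- path_via_C1 + path_via_C2
print(c(via_C1 = path_via_C1, via_C2 = path_via_C2, total = p_conversion))
```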
Markov chain
A Markov chain is a process that models movement between states and assigns a probability to each move from one state to another. A Markov chain is defined by three properties:
State space – the set of all possible states that may exist
Transition operator – the probability of moving from one state to another
Current state probability distribution – the probability distribution of being in any one state at the beginning of the process
In our example, we know the states a customer can pass through, the probability of moving from one state to the next, and the probability of starting in each state. This looks a lot like a Markov chain.
In fact, this is an application of Markov chains. If we were to figure out the contribution of channel C1 to our customers’ journey from start to finish, we would use the removal effect principle. The removal effect principle says that if we want to find the contribution of each channel in the customer journey, we can do so by removing each channel in turn and seeing how many conversions would have occurred without that channel.
For example, suppose we want to calculate the contribution of channel C1. We remove C1 from the model and compare the number of conversions that occur without C1 in the picture against the total number of conversions when all channels are intact. For channel C1:
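To make the three properties concrete, here is a minimal sketch of the small example as a Markov chain in base R. We assume that any probability not leading to the next channel or to a conversion goes to an absorbing "null" (no conversion) state; that state and all variable names are our own additions.

```r
# State space: the three channels plus two absorbing outcomes.
states <- c("C1", "C2", "C3", "conversion", "null")

# Transition operator: P[i, j] is the probability of moving from state i to j.
P <- matrix(0, nrow = 5, ncol = 5, dimnames = list(states, states))
P["C1", "C2"] <- 0.5
P["C1", "null"] <- 0.5          # assumption: the rest of C1's traffic is lost
P["C2", "C3"] <- 1
P["C3", "conversion"] <- 0.6
P["C3", "null"] <- 0.4          # assumption: non-converting traffic ends here
P["conversion", "conversion"] <- 1
P["null", "null"] <- 1

# Current state probability distribution at the start of the process.
init <- c(C1 = 0.5, C2 = 0.5, C3 = 0, conversion = 0, null = 0)

# Push the distribution through the chain; three steps absorb every journey.
dist <- init
for (step in 1:3) dist <- drop(dist %*% P)
print(round(dist, 3))
```

After three steps all probability mass sits in the two absorbing states, and the mass in `conversion` matches the hand calculation of the overall conversion probability.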
P (conversion after removing C1) = P (C2→C3→conversion)
= 0.5 * 1 * 0.6
= 0.3
Without C1, 30% of customer interactions still convert; with C1 intact, 45% convert. So 0.3 / 0.45 = 0.667 of the conversions survive the removal of C1, and the removal effect of C1, i.e. the share of conversions that is lost, is
1 - 0.3 / 0.45 = 0.333.
The removal effect of C2 and C3 is 1 (you can try to calculate it, but think intuitively: if we were to remove either C2 or C3, would we be able to complete any conversion?).
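The removal effects can also be computed mechanically. The sketch below is our own illustration in base R, not the code used for the case study. It assumes an absorbing "null" state for lost traffic and models "removing" a channel by sending everything that reaches it straight to that state. The final normalisation into attribution shares is a common convention in Markov attribution and is likewise an assumption here.

```r
states <- c("C1", "C2", "C3", "conversion", "null")
P <- matrix(c(
  0, 0.5, 0, 0,   0.5,   # from C1
  0, 0,   1, 0,   0,     # from C2
  0, 0,   0, 0.6, 0.4,   # from C3
  0, 0,   0, 1,   0,     # conversion (absorbing)
  0, 0,   0, 0,   1      # null (absorbing, assumed)
), nrow = 5, byrow = TRUE, dimnames = list(states, states))
init <- c(C1 = 0.5, C2 = 0.5, C3 = 0, conversion = 0, null = 0)

# Probability of ending in "conversion" after enough steps to absorb all mass.
conversion_prob <- function(P, init, n_steps = 50) {
  dist <- init
  for (i in seq_len(n_steps)) dist <- drop(dist %*% P)
  dist[["conversion"]]
}

# Removing a channel: every visit to it is redirected to the null state.
remove_channel <- function(P, channel) {
  P[channel, ] <- 0
  P[channel, "null"] <- 1
  P
}

base_prob <- conversion_prob(P, init)
channels <- c("C1", "C2", "C3")

# Removal effect = share of conversions lost when the channel is removed.
removal_effect <- sapply(channels, function(ch) {
  1 - conversion_prob(remove_channel(P, ch), init) / base_prob
})

# Normalise so that the shares of all channels sum to 1.
attribution_share <- removal_effect / sum(removal_effect)
print(round(rbind(removal_effect, attribution_share), 3))
```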
This is a very useful application of Markov chains. In the above case, the channels C1, C2 and C3 (at different stages) are called transition states, and the probability of moving from one channel to another is called the transition probability.
A customer journey is a sequence of channels that can be viewed as a chain in a directed Markov graph, where each vertex is a state (channel/touchpoint) and each edge represents a move from one state to another, weighted by its transition probability. Since the probability of reaching a state depends only on the previous state, the journey can be treated as a memoryless Markov chain.
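In practice the transition probabilities are not given; they have to be estimated from observed journeys. Here is a minimal base R sketch of that counting step. The four journeys and their outcomes are made-up toy data, and the "start", "conversion" and "null" labels are our own convention.

```r
# Toy journeys (hypothetical) and whether each one ended in a conversion.
paths <- c("C1 > C2 > C3", "C2 > C3", "C1 > C2", "C2 > C3 > C1 > C3")
converted <- c(TRUE, TRUE, FALSE, TRUE)

# Wrap every journey with a start state and an absorbing end state.
steps <- Map(function(p, conv) {
  c("start", strsplit(p, " > ", fixed = TRUE)[[1]],
    if (conv) "conversion" else "null")
}, paths, converted)

# Each consecutive pair of states is one observed transition (edge).
from <- unlist(lapply(steps, function(s) head(s, -1)), use.names = FALSE)
to   <- unlist(lapply(steps, function(s) tail(s, -1)), use.names = FALSE)

# Count the transitions and normalise each row into probabilities.
counts <- table(from, to)
trans_prob <- prop.table(counts, margin = 1)
print(round(trans_prob, 3))
```

Because each row only depends on the state the customer is currently in, this is a first-order (memoryless) estimate.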
E-commerce company case study
Let’s do a real case study and see how we implement channel attribution modeling.
An e-commerce company conducted a survey and collected data from its customers. The respondents can be considered a representative sample. In the survey, the company collected data on the touchpoints customers visited before finally purchasing the product on its website.
There are a total of 19 channels where customers can encounter the product or its advertisements. Beyond these 19 channels, there are three further states:
#20 – The customer has decided which device to buy;
#21 – The customer has made the final purchase;
#22 – The customer is undecided.
The overall classification of channels is as follows:
Category | Channel |
---|---|
Website (1,2,3) | Company’s website or competitor’s website |
Research report (4,5,6,7,8) | Industry Consulting research report |
Online/Comments (9,10) | Natural search, forum |
Price Comparison (11) | Aggregation Channels |
Friends (12,13) | Social Network |
Experts (14) | Online or offline experts |
Retail stores (15,16,17) | Physical stores |
Others (18,19) | Others, such as promotions in various places |
Now, we need to help the e-commerce company determine the right strategy for investing in marketing channels. Which channels should it focus on? Which channels should it invest in? We will use R to solve this problem in the next section.
Implementation using R
Let’s read in the data, implement the model in R, and check the results.
> head(channel)
Output:
R05A.01 | R05A.02 | R05A.03 | R05A.04 | ….. | R05A.18 | R05A.19 | R05A.20 |
---|---|---|---|---|---|---|---|
16 | 4 | 3 | 5 | ….. | NA | NA | NA |
2 | 1 | 9 | 10 | ….. | NA | NA | NA |
9 | 13 | 20 | 16 | ….. | NA | NA | NA |
8 | 15 | 20 | 21 | ….. | NA | NA | NA |
16 | 9 | 13 | 20 | ….. | NA | NA | NA |
1 | 11 | 8 | 4 | ….. | NA | NA | NA |
Next, we process the data into a form that can be used as input to our model, and determine which customers made the final conversion.
We create a variable ‘path’ in a specific format that can be used as input to the model. Additionally, we will use the ddply function from the “plyr” package to find the total number of conversions for each path.
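The preprocessing code is not shown at this point, so here is a rough sketch of what this step could look like in base R. The toy data frame, the column names `t1` to `t4`, and the choice to drop state 21 from the path string are all assumptions made for illustration.

```r
# Hypothetical wide data: one row per customer, one column per touchpoint.
channel <- data.frame(
  t1 = c(1, 1, 8, 1),
  t2 = c(1, 1, 15, 12),
  t3 = c(20, 20, 20, NA),
  t4 = c(21, 21, NA, NA)
)
touch_cols <- c("t1", "t2", "t3", "t4")

# A journey counts as converted if state 21 (final purchase) appears in it.
channel$convert <- as.integer(
  apply(channel[, touch_cols] == 21, 1, any, na.rm = TRUE)
)

# Build the path string: drop missing steps and the purchase state itself.
channel$path <- apply(channel[, touch_cols], 1, function(x) {
  x <- x[!is.na(x) & x != 21]
  paste(x, collapse = " > ")
})

# Sum the conversions per distinct path (a grouped sum).
channel_fin <- aggregate(convert ~ path, data = channel, FUN = sum)
names(channel_fin)[2] <- "conversion"
print(channel_fin)
```

The result has one row per distinct path with its total number of conversions, which is the shape a path-based attribution model needs as input.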
Path change
Path | Conversion |
---|---|
1 > 1 > 1 > 20 | 1 |
1 > 1 > 12 > 12 | 1 |
1 > 1 > 14 > 13 > 12 > 20 | 1 |
1 > 1 > 3 > 13 > 3 > 20 | 1 |
1 > 1 > 3 > 17 > 17 | 1 |
1 > 1 > 6 > 1 > 12 > 20 > 12 | 1 |
> channel_fin = ddply(channel_fin, ~path, summarise, conversion = sum(convert))
> head(channel_fin)
Output:
Path change
Path | Conversion |
---|---|
1 > 1 > 1 > 20 | 1 |
1 > 1 > 12 > 12 | 1 |
1 > 1 > 14 > 13 > 12 > 20 | 1 |
1 > 1 > 3 > 13 > 3 > 20 | 1 |
1 > 1 > 3 > 17 > 17 | 1 |
1 > 1 > 6 > 1 > 12 > 20 > 12 | 1 |
Now we will create a heuristic model and a Markov model, combine the two, and examine the final results.
Output:
Channel Name | First-Touch Conversion | ….. | Linear-Touch Conversion | Linear-Touch Value |
---|---|---|---|---|
1 | 130 | ….. | 73.773661 | 73.773661 |
20 | 0 | ….. | 473.998171 | 473.998171 |
12 | 75 | ….. | 76.127863 | 76.127863 |
14 | 34 | ….. | 56.335744 | 56.335744 |
13 | 320 | ….. | 204.039552 | 204.039552 |
3 | 168 | ….. | 117.609677 | 117.609677 |
17 | 31 | ….. | 76.583847 | 76.583847 |
6 | 50 | ….. | 54.707124 | 54.707124 |
8 | 56 | ….. | 53.677862 | 53.677862 |
10 | 547 | ….. | 211.822393 | 211.822393 |
11 | 66 | ….. | 107.109048 | 107.109048 |
16 | 111 | ….. | 156.049086 | 156.049086 |
2 | 199 | ….. | 94.111668 | 94.111668 |
4 | 231 | ….. | 250.784033 | 250.784033 |
7 | 26 | ….. | 33.435991 | 33.435991 |
5 | 62 | ….. | 74.900402 | 74.900402 |
9 | 250 | ….. | 194.07169 | 194.07169 |
15 | 22 | ….. | 65.159225 | 65.159225 |
18 | 4 | ….. | 5.026587 | 5.026587 |
19 | 10 | ….. | 12.676375 | 12.676375 |
Output:
Channel Name | Total Conversion | Total Conversion Value |
---|---|---|
1 | 82.482961 | 82.482961 |
20 | 432.40615 | 432.40615 |
12 | 83.942587 | 83.942587 |
14 | 63.08676 | 63.08676 |
13 | 195.751556 | 195.751556 |
3 | 122.973752 | 122.973752 |
17 | 83.866724 | 83.866724 |
6 | 63.280828 | 63.280828 |
8 | 61.016115 | 61.016115 |
10 | 209.035208 | 209.035208 |
11 | 118.563707 | 118.563707 |
16 | 158.692238 | 158.692238 |
2 | 98.067199 | 98.067199 |
4 | 223.709091 | 223.709091 |
7 | 41.919248 | 41.919248 |
5 | 81.865473 | 81.865473 |
9 | 179.483376 | 179.483376 |
15 | 70.360777 | 70.360777 |
18 | 5.950827 | 5.950827 |
19 | 15.545424 | 15.545424 |
Before we go any further, let’s first understand what some of the terms we’ve seen above mean.
First-touch conversion: A conversion credited to a channel when that channel is the first touchpoint for a customer. The first touchpoint gets 100% of the credit.
Last-touch conversion: A conversion credited to a channel when that channel is the last touchpoint for a customer. The last touchpoint gets 100% of the credit.
Linear-touch conversion: A conversion whose credit is split equally among all touchpoints on the customer's path.
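To see how these heuristic rules distribute credit, here is a small base R sketch. It is our own illustration on made-up paths, not the code that produced the tables above, and the helper name `credit` is hypothetical.

```r
# Toy aggregated paths (hypothetical) with their conversion counts.
paths <- data.frame(
  path = c("1 > 3 > 20", "10 > 4", "1 > 20", "10 > 13 > 4 > 20"),
  conversion = c(2, 1, 3, 4),
  stringsAsFactors = FALSE
)
touches <- strsplit(paths$path, " > ", fixed = TRUE)

# weights_fun maps the touchpoints of one path to weights that sum to 1;
# each path's conversions are spread over its touchpoints by those weights.
credit <- function(weights_fun) {
  parts <- Map(function(tp, conv) {
    data.frame(channel = tp, credit = conv * weights_fun(tp))
  }, touches, paths$conversion)
  all_parts <- do.call(rbind, parts)
  tapply(all_parts$credit, all_parts$channel, sum)
}

# First touch: all credit to position 1. Last touch: all credit to the end.
first_touch  <- credit(function(tp) as.numeric(seq_along(tp) == 1))
last_touch   <- credit(function(tp) as.numeric(seq_along(tp) == length(tp)))
# Linear touch: equal credit for every touchpoint on the path.
linear_touch <- credit(function(tp) rep(1 / length(tp), length(tp)))

print(round(cbind(first_touch, last_touch, linear_touch), 3))
```

Whatever the rule, the credited conversions add up to the total number of conversions; the rules only differ in how that total is divided among channels.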
Back to the R code, let’s merge the two models and represent the output visually.
# Plot total conversions
ggplot(R1, aes(channel_name, value, fill = variable)) +
  geom_bar(stat = 'identity', position = 'dodge')
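The plotting call expects a long-format data frame R1 with the columns channel_name, variable and value. The merging step itself is not shown, so the sketch below is one possible way to build such a frame in base R. The measure column names are our own assumptions; the numbers are the values for channels 10, 13 and 20 from the tables above.

```r
# Heuristic and Markov results for three channels (values from the tables).
heuristic <- data.frame(
  channel_name = c("10", "13", "20"),
  first_touch_conversions  = c(547, 320, 0),
  linear_touch_conversions = c(211.822393, 204.039552, 473.998171)
)
markov <- data.frame(
  channel_name = c("10", "13", "20"),
  total_conversions = c(209.035208, 195.751556, 432.40615)
)

# Combine both models into one wide table, one row per channel.
merged <- merge(heuristic, markov, by = "channel_name")

# Stack the measure columns into long format: one row per channel and measure.
measure_cols <- setdiff(names(merged), "channel_name")
R1 <- data.frame(
  channel_name = rep(merged$channel_name, times = length(measure_cols)),
  variable     = rep(measure_cols, each = nrow(merged)),
  value        = unlist(merged[measure_cols], use.names = FALSE)
)
print(R1)
```

With the data in this shape, fill = variable gives each model its own bar colour and position = 'dodge' places the bars for one channel side by side.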
The results can be seen clearly in the chart. From a first-touch perspective, Channels 10, 13, 2, 4 and 9 are very important, while from a last-touch perspective Channel 20 is the most important (because in our case it is the state in which customers decide which product to buy). In terms of linear-touch conversion, Channels 20, 4 and 9 are important. From a total conversion perspective, Channels 10, 13, 20, 4 and 9 are very important.
Conclusion
From the chart above, we have been able to work out which channels are important to focus on and which can be ignored. This case study gives us a good understanding of how Markov chain models can be applied in customer analytics. The e-commerce company can now shape its marketing strategy more accurately and allocate its marketing budget using data-driven insights.
This article is selected from “R Language Uses Markov Chain to Model Channel Attribution in Marketing”.