How to Set Up a Rules-Based Marketing Attribution Model

The objective of this post is to provide a how-to guide for setting up a rules-based marketing attribution model quickly. The solution takes a time series with a tracking ID, uses Python to clean and manipulate the data set, and then leverages Tableau for visualisation. This approach allows the model to be created while maintaining all of the contextual data related to the journey.

I built this quickly in a couple of days, using a combination of an analytics data feed (in this case from Adobe Data Warehouse), a short Python script, and Tableau to visualise the results.

Just as a note, this should be viewed as a first step to get you a working solution that you can build on. It's by no means perfect, and for speed of development I took some shortcuts (which I make clear in this blog post). However, the power of this method is that you have complete control over the implementation and can modify it as your model develops. At the end of the post I outline some next steps on how to improve the model.

Technology 

Like any recipe, you are going to need the right ingredients and the right equipment; in this case, the data and the right tech. Below is an outline of the tools I used and why I chose them.

Data Source - I took a data feed from Adobe Data Warehouse. The requirement for this method to work is a time series with a tracking ID that identifies the visitor. The added bonus of Data Warehouse is that it includes the Visit Number, which means I don't have to create this on the fly downstream in the Python script (which is totally doable).
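For illustration, the kind of feed this method expects looks something like the below. The column names and values are my own invention; your Data Warehouse request defines the real ones.

    visitor_id,visit_num,timestamp,tracking_code,conversion_flag
    1234,1,2021-05-03 09:14:00,ppc_brand,0
    1234,2,2021-05-05 18:02:00,email_promo,0
    1234,2,2021-05-05 18:05:00,,1

The essentials are a tracking ID that identifies the visitor and a timestamp, so every row can be grouped and ordered into a journey.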

Python - For this project I needed a scripting language capable of cleaning and manipulating the data feed so it's ready to be fed into Tableau. I chose Python because it's relatively easy and quick to write in (you pretty much write pseudo-code and you are there), and it is absolutely ubiquitous.

Tableau - Tableau is a great visualisation tool. It's not really designed for ETL, but several recently released features (such as Level of Detail calculations) make it perfect for adding additional value to the data file after the transformation phase.

 

So, in this post I am going to outline the process step by step:

Step #1: Outline the attribution methods (First Touch, Last Touch, Evens, Starter-Player-Closer, Temporal, and Spatial)
Step #2: How to get the file from Data Warehouse (and a discussion of other methods of getting the data)
Step #3: The input file
Step #4: How to manipulate the file in Python
Step #5: The output file
Step #6: How to visualise the data in Tableau

Outline Attribution Models 

Rules-based attribution models are discussed in detail in tons of blog posts (here, here, and here). In short, they are different methods of allocating credit to a conversion event. If you want to find out more then please read the aforementioned blogs, but for this case please look at the table below, which summarises the models and how they each allocate credit.

 

 

                        Monday     Wednesday  Thursday   Thursday   Friday
                        PPC >>     Email >>   Social >>  Display >> Conversion
First Touch             1.00       -          -          -          1.00
Last Touch              -          -          -          1.00       1.00
Evens                   0.25       0.25       0.25       0.25       1.00
Starter-Player-Closer   0.30       0.15       0.15       0.40       1.00
Spatial                 4          3          2          1          1.00
Temporal                5 days     2 days     1 day      1 day      1.00

  

Now, we have a customer, let's call her Wendy. On Monday she interacts with a PPC campaign, then on Wednesday and Thursday she interacts with an Email and a Social campaign respectively, and later on Thursday with a Display campaign, before finally converting on Friday. The table above shows this user journey and how credit would be awarded to each channel under the different attribution models. It's worth noting that the spatial model counts positions backwards from the conversion, and the temporal model measures days back from the conversion; these models aim to tell you where and when each channel sits in the user journey relative to the conversion.
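If it helps to see the rules as code, here is a small illustrative Python function (not part of the actual pipeline) that allocates credit across an ordered list of channels. The Starter-Player-Closer weights (0.30 for the starter, 0.40 for the closer, the remaining 0.30 split across the players) are the ones used in the table above.

    def allocate_credit(touchpoints, model):
        """Allocate one conversion's credit across an ordered list of
        channels, e.g. ["PPC", "Email", "Social", "Display"].
        Illustrative only; starter-player-closer assumes 3+ touchpoints."""
        n = len(touchpoints)
        credit = {t: 0.0 for t in touchpoints}
        if model == "first_touch":
            credit[touchpoints[0]] = 1.0
        elif model == "last_touch":
            credit[touchpoints[-1]] = 1.0
        elif model == "evens":
            for t in touchpoints:
                credit[t] += 1.0 / n
        elif model == "starter_player_closer":
            credit[touchpoints[0]] += 0.30      # starter
            credit[touchpoints[-1]] += 0.40     # closer
            for t in touchpoints[1:-1]:         # players share the remaining 0.30
                credit[t] += 0.30 / (n - 2)
        return credit

    journey = ["PPC", "Email", "Social", "Display"]
    print(allocate_credit(journey, "starter_player_closer"))
    # {'PPC': 0.3, 'Email': 0.15, 'Social': 0.15, 'Display': 0.4}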

That covers a single user journey in summary form. Next, I find it useful to look at how this raw data would be represented in a data file; then we can go forward and create some assumptions and a framework for applying the model.

 

Phase 1: Read in the file and add information that allows grouping by tracking ID and ordering by time.

So, looking at the data file, we need to run through it and do a number of things (in the associated Python file I have flagged which bits of code do what); a sketch of this phase follows the list below.

1.1 Open the raw data file and walk through it line by line.

1.2 Create a Visit-ID: a concatenation of the User-ID and the Visit ID to make a unique Visit-ID. This is used in Tableau to count the number of visits.

1.3 For each visit, flag the first instance of a tracking code; there should be one per visit.

1.4 Create a counter that adds the page view number within a visit (this is useful for analytics, but is also used as a sort key in the next phase).

1.5 Output these to a file for use in Phase 2.
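To make this concrete, here is a minimal sketch of what Phase 1 could look like. It is not the actual project script; the column positions and file names are assumptions for illustration, so adjust them to match your own export.

    import csv

    # Assumed column positions in the raw feed -- adjust to your export.
    VISITOR_ID, VISIT_NUM, TIMESTAMP, TRACKING_CODE, CONVERSION = 0, 1, 2, 3, 4

    with open("raw_feed.csv") as src, open("phase1_out.csv", "w", newline="") as dst:
        reader, writer = csv.reader(src), csv.writer(dst)
        flagged = set()   # visits that already have a tracking code flagged (1.3)
        page_views = {}   # running page view counter per visit (1.4)
        for row in reader:                                      # 1.1
            visit_id = row[VISITOR_ID] + "-" + row[VISIT_NUM]   # 1.2
            first_code = 0
            if row[TRACKING_CODE] and visit_id not in flagged:
                first_code = 1
                flagged.add(visit_id)
            page_views[visit_id] = page_views.get(visit_id, 0) + 1
            writer.writerow(row + [visit_id, first_code, page_views[visit_id]])  # 1.5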

 

Phase 2: This stage performs the grouping by tracking ID and the ordering by time; it also flags the first conversion event for each visitor.

2.1 Open the file created in Phase 1.

2.2 Read the file into memory (this is a dirty solution done for speed; there is a more scalable approach, but for the purpose of getting off the ground I recommend this) and apply Python's sort function, using the sort key created in Phase 1.

2.3 Using the same method as for the first campaign touch in a visit, do the same again, but this time for the conversion event.

2.4 Output the file with the first conversion flag; the data should now be grouped by tracking ID. A sketch of this phase follows.
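Again, a minimal sketch of Phase 2, continuing the assumed column layout from the Phase 1 sketch (Phase 1 appends its new fields to the end of each row):

    import csv

    # Assumed positions: raw columns 0-4, then Phase 1 appends
    # visit_id (5), first_code (6), page_view (7).
    VISITOR_ID, TIMESTAMP, CONVERSION, PAGE_VIEW = 0, 2, 4, 7

    with open("phase1_out.csv") as src:
        rows = list(csv.reader(src))   # 2.2: in memory, for speed of development

    # Group by tracking ID, then order by time and page view number
    rows.sort(key=lambda r: (r[VISITOR_ID], r[TIMESTAMP], int(r[PAGE_VIEW])))

    converted = set()   # visitors whose first conversion is already flagged
    with open("phase2_out.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in rows:
            first_conversion = 0
            if row[CONVERSION] == "1" and row[VISITOR_ID] not in converted:  # 2.3
                first_conversion = 1
                converted.add(row[VISITOR_ID])
            writer.writerow(row + [first_conversion])   # 2.4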

Phase 3: This is the most important phase and adds in all the crucial data required for the attribution. It reads the file backwards and looks for the conversion event; when it finds one, it stores the timestamp of the conversion in a variable and writes that timestamp onto every touchpoint row flagged in Phase 2 (this is used later in Tableau to calculate the temporal model). It also counts backwards at each touchpoint, which gives the positions for the spatial attribution model. The result is then output to a file; a sketch follows.
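A sketch of Phase 3, under the same assumed column layout as the earlier sketches:

    import csv

    # Assumed positions, continuing from the Phase 2 sketch:
    # Phase 2 appends first_conversion (8).
    VISITOR_ID, TIMESTAMP, FIRST_CODE, FIRST_CONVERSION = 0, 2, 6, 8

    with open("phase2_out.csv") as src:
        rows = list(csv.reader(src))

    with open("phase3_out.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        prev_visitor, conversion_time, position = None, "", 0
        for row in reversed(rows):          # walk the file backwards
            if row[VISITOR_ID] != prev_visitor:   # new visitor: reset state
                prev_visitor, conversion_time, position = row[VISITOR_ID], "", 0
            if row[FIRST_CONVERSION] == "1":
                conversion_time, position = row[TIMESTAMP], 0
            touch_position = ""
            if row[FIRST_CODE] == "1" and conversion_time:
                position += 1               # spatial: 1, 2, 3... back from conversion
                touch_position = position
            # conversion_time on each touchpoint row feeds the temporal model
            writer.writerow(row + [conversion_time, touch_position])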

Phase 4: This just reverses the file order. Because Phase 3 reads the file backwards, its output is back to front, so this phase puts it back into chronological order. This doesn't strictly need to be done, as Tableau would sort it out; it's more for my own sanity when looking at the data file for validation. I am sure this could be done more cleanly in memory; however, on larger data sets that would not be scalable.
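Under the same assumptions, Phase 4 is only a few lines:

    # Reverse the Phase 3 output back into chronological order
    with open("phase3_out.csv") as src:
        lines = src.readlines()
    with open("final_out.csv", "w") as dst:
        dst.writelines(reversed(lines))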

 

This is the end of the data manipulation in Python; everything else can be done in Tableau, which I am going to cover in the next section. However, I think it's worth revisiting the example mentioned earlier to get an idea of how the data structure now looks. There are a lot of additional fields in the output, so you can see a sample here. The points to note are how it counts backwards from the conversion across the touchpoints, and how the time of conversion is now copied onto each touchpoint row; this is important for later.

OK, now we have our data, let's look at how we can use it in Tableau. I have created a template Tableau file, which can be downloaded here, that reads in the aforementioned file.

 

For Tableau, we want to read in the data file and then build some web-analytics-friendly metrics and some useful dimensions, like Visit and Visitor. These are all already in the Tableau file, so you should be able to build from there; all the relevant metrics and dimensions that allow you to do the attribution are in the file.
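As an illustration, here is the sort of Tableau calculated field involved. The field names are hypothetical stand-ins for the ones in my file, but COUNTD and Level of Detail expressions are standard Tableau syntax.

    // Visits: distinct count of the unique Visit-ID created in Phase 1
    COUNTD([Visit Id])

    // Evens credit per row: split a visitor's conversion evenly
    // across that visitor's flagged touchpoints
    { FIXED [Visitor Id] : SUM([First Conversion Flag]) }
      / { FIXED [Visitor Id] : SUM([First Touch Flag]) }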

 

You can download the files here and here. Please contact me if you have any questions at Phillip.law@yourfavouritestory.com


Written by Phillip Law