Creating a Content Recommendation Engine using R, OpenCPU and GTM

Written by Mark Edmondson on 10 Mar 2016, 13:16


Breaking down your data analysis silos: how to link R and Google Tag Manager with OpenCPU

A weekend with the Measurecampers

Last weekend I got to go to MeasureCamp London, which as always was a great event and left me with lots to think about on how to improve analytics for our clients going into 2016.

MeasureCamp is an “unconference” where the focus is on knowledge exchange rather than a fixed agenda.  At the start of the day a blank session board is created, and the attendees themselves fill in session cards to talk in each 35-minute time slot.  The format means sessions are only as good as the attendees themselves, who in MeasureCamp’s case are the top analytics practitioners in Europe and beyond.

Four people from IIH Nordic gave presentations, including “Enterprise Analytics - making it big” from Peter, “Conversion on 3rd party websites” from Florian and “Compared to What?” analytics by Steen.

My presentation was in the same vein as my last MeasureCamp sessions, focusing on how R can enhance the website data model.

R is an open-source statistical programming language that is becoming more and more popular as the drive for data analytics grows.  There are an estimated two million R users worldwide in 2016, of which 40% were added last year.  Its open-source nature and focus on making tools for data mean that libraries are available for every step of a data processing pipeline: collection, ETL, statistics, machine learning, visualisation and more.

 

Connecting data analysis silos

One challenge in creating analysis that makes an impact on your business is data sitting in its own departmental silo, which can also leave data analyses disconnected from the broader picture.

For example, your CRM team may have created advanced churn models and customer lifetime value calculations that are applied to your email segments, but the same analysis isn’t applied when that customer arrives on your website, to personalise offers and increase conversion.

A good step forward is to link datasets offline to draw conclusions you can apply across all departments.  But wouldn’t it be better if your offline analyses could be linked in real time to your web data, creating predictions or adding metadata quickly enough that you could show it to users as they browse?

A barrier to this may be the different disciplines of the people in your business needed to make it happen.  A web developer’s skill set is quite different from a data analyst’s.  But if you could take the output of your data analyses (say an R function) and present it in a form your web developers can use (say a JSON array in JavaScript), then you get to leverage each of their strengths.
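
As a minimal sketch of that hand-off, assume an R function returns a data frame of recommended pages and that it arrives in the browser as a JSON array of row objects. The field names `page` and `probability` and the example values are hypothetical, chosen only for illustration. On the JavaScript side no R knowledge is needed to use it:

```javascript
// Example payload: what a JSON-serialised R data frame could look like.
// The field names "page" and "probability" are assumptions for illustration.
var payload = '[{"page":"/blog","probability":0.27},' +
              '{"page":"/pricing","probability":0.62},' +
              '{"page":"/contact","probability":0.11}]';

// Parse the JSON text and return valid rows, most likely page first.
function parseRecommendations(jsonText) {
  var rows = JSON.parse(jsonText);
  if (!Array.isArray(rows)) {
    throw new TypeError("Expected a JSON array of rows");
  }
  return rows
    .filter(function (row) {
      // Keep only rows with a string page and a finite numeric probability.
      return row !== null &&
        typeof row === "object" &&
        typeof row.page === "string" &&
        typeof row.probability === "number" &&
        isFinite(row.probability);
    })
    .sort(function (a, b) {
      return b.probability - a.probability;
    });
}

var recommendations = parseRecommendations(payload);
console.log(recommendations[0].page);
```

The analyst owns everything that produces the array, and the web developer owns everything after `JSON.parse`.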

The aim of my presentation was to show how this could be done by marrying the worlds of R and the data layer of Google Tag Manager.

 

Linking R with Google Tag Manager via OpenCPU

OpenCPU is a service that turns R code into a JSON API that JavaScript can call. It bills itself as a framework for embedding scientific computing, and is part of the growing sector of RaaS (R-as-a-service) resources out there that promise you R capabilities wherever you need them.
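
To give a feel for what “a JSON API that JavaScript can call” means, here is a rough sketch of how such a call could be assembled. It assumes the OpenCPU convention of posting to `/library/{package}/R/{function}/json` with the function arguments as a JSON body. The server address, the package name `myRecommender`, the function name `predictNextPage` and the argument `current_url` are all hypothetical placeholders, not the names used in my demo.

```javascript
// Build the parts of an OpenCPU-style function call without sending it.
// The server, package, function and argument names used below are
// hypothetical; replace them with those of your own OpenCPU deployment.
function buildOpenCpuRequest(baseUrl, pkg, fn, args) {
  var url = baseUrl.replace(/\/+$/, "") +      // drop any trailing slashes
    "/library/" + encodeURIComponent(pkg) +
    "/R/" + encodeURIComponent(fn) +
    "/json";                                   // ask for the result as JSON
  return {
    url: url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(args)                 // named arguments for the R function
  };
}

var request = buildOpenCpuRequest(
  "https://example.com/ocpu/",
  "myRecommender",
  "predictNextPage",
  { current_url: "/pricing" }
);
console.log(request.method + " " + request.url);
```

The returned object can then be handed to whichever HTTP client the page already uses, for example `XMLHttpRequest`.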

My talk went through the technical data architecture to link R with Google Tag Manager, using a content recommendation engine as an example.  A live demo is also included in the documentation.

Fig.1 - Data Architecture for calling R from GTM
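
The last hop in Fig.1, from the API response into Google Tag Manager, can be sketched as a push to the data layer. This is an illustration under assumptions rather than the code from the demo: the event name `nextPagePrediction` and the keys `predictedPage` and `predictedProbability` are made up here, and you would use whatever names your own GTM triggers and variables expect.

```javascript
// Push a prediction into a GTM-style data layer.
// The event name and key names are assumptions for illustration.
function pushPrediction(dataLayer, prediction) {
  if (!prediction || typeof prediction.page !== "string") {
    return false; // nothing usable came back, leave the data layer untouched
  }
  dataLayer.push({
    event: "nextPagePrediction",
    predictedPage: prediction.page,
    predictedProbability: prediction.probability
  });
  return true;
}

// In a browser with GTM installed this would be window.dataLayer;
// a plain array stands in for it here so the sketch runs anywhere.
var dataLayer = [];
pushPrediction(dataLayer, { page: "/pricing", probability: 0.62 });
console.log(dataLayer[0].event);
```

Once the event is in the data layer, a GTM trigger listening for that event name can fire any tag that uses the predicted page.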


It demonstrates how a Markov model can be generated in R, how that predictive model is uploaded to OpenCPU, and how it is then queried in under a second from Google Tag Manager to present a prediction to users as they browse a website.

Fig.2 - Example Markov model for predicting next pageview
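
To make the idea behind Fig.2 concrete, here is a minimal sketch of a first-order Markov model for next-page prediction: count how often each page is followed by each other page, then predict the most frequent follower. The model in the talk was built in R. This JavaScript version is only an illustration of the technique, and the session data is invented.

```javascript
// Count page-to-page transitions across sessions (first-order Markov chain).
// Each session is an array of page paths in the order they were viewed.
function fitTransitions(sessions) {
  var counts = Object.create(null); // no prototype, so any page name is a safe key
  sessions.forEach(function (pages) {
    for (var i = 0; i < pages.length - 1; i++) {
      var from = pages[i];
      var to = pages[i + 1];
      counts[from] = counts[from] || Object.create(null);
      counts[from][to] = (counts[from][to] || 0) + 1;
    }
  });
  return counts;
}

// Return the most likely next page and its estimated probability,
// or null if the current page was never seen with a following page.
function predictNext(counts, currentPage) {
  var row = counts[currentPage];
  if (!row) {
    return null;
  }
  var total = 0;
  var best = null;
  Object.keys(row).forEach(function (to) {
    total += row[to];
    if (best === null || row[to] > row[best]) {
      best = to;
    }
  });
  return { page: best, probability: row[best] / total };
}

// Invented example sessions, for illustration only.
var sessions = [
  ["/home", "/pricing", "/contact"],
  ["/home", "/pricing", "/signup"],
  ["/home", "/blog"],
  ["/blog", "/pricing", "/contact"]
];
var model = fitTransitions(sessions);
console.log(predictNext(model, "/home"));
```

A real model would be trained on far more sessions and could use longer page histories, but the lookup at prediction time stays cheap, which is what makes sub-second responses plausible.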


Seeing it work live in Google Tag Manager really brought home what could be achieved, and how relatively quickly it can be done versus, say, off-the-shelf bought solutions.  It was exciting to realise the possibilities!

The (technical) presentation from the day is shown below:


If you’d like to explore the code and see the live example spoken about in the presentation, the model and a website with copy-paste code is available here: http://code.markedmondson.me/predictClickOpenCPU/

I hope the talk is at least inspiring about what can be done using this approach, even if you only use certain elements of it.


 

Taking it further

With this method, potentially all the results of offline analysis performed in R can be applied in real time to a live user on the website.  Any data analysis that can’t be done within JavaScript can be made available to the front-end developers building the website, without them needing any knowledge of R.

I like this as it gives us a path up the analytics mountain: from reporting to prediction to prescription to automation.

Linking to other data sources, using machine learning libraries, and making predictions and suggestions are now all possible, embedded within a website’s data model.  I look forward to what you come up with. Let me know on Twitter at @HoloMarked if you think of any good examples!

 


