The practical guide to building an analytics platform
Over the next few weeks I'm going to be writing a bunch of material to help people on their big data and analytics journeys. I've spent the last few years implementing big data and analytics projects and I'm going to pull together all of the things I've learned to do (and not do!) here.
This series aims to provide a comprehensive "how to" guide to building a successful, vibrant analytics platform that people will be clamouring to use.
I'll also be looking at how you gear up the platform to integrate with other systems so that all the clever analytics can be put to use in real-world applications, such as data-driven marketing campaigns or predictive supply chain analytics.
Whilst I will cover a range of topics, this series is particularly aimed at those responsible for designing and building a big data platform and those who will consume it, such as analysts and data science teams.
It will be a collection of pragmatic advice covering a range of topics, both technical and non-technical including...
How to run the team
I'll be looking at the leadership, team structure, skills, processes and technologies you need to be successful. Without a strong foundation in this area everything else will be an uphill struggle. This is a part many people get wrong from the start, so I'll provide some practical ways to structure and run a delivery team. I'll also look at some of the key technologies you're going to need in your tool belt.
How to ingest and store your raw data
How you go about getting data into the platform, and how you store all of that raw data, is vital to get right. Building a solid ingestion layer that's rapid to deploy and easy to reuse will save you a lot of time in the long run, both in future deliveries and when it comes to support. This will also be the place where your raw data lives in the long term, so I'll talk about how to store it in a way that makes it valuable for users, testing and historic rebuilds.
How to model your data
On top of the raw data sit the base and analytic layers - the places where we start to model the data and make it presentable to users. I see a lot of confusion when it comes to how to handle this part of a big data or analytical platform, and there are a number of new and competing methodologies to choose from. I'll help you come up with a simple modelling approach that combines the best of everything (and a few new things too).
How to access your data
I'll finish by talking about the provisioning of data and take a look at some of the great tools you need in your arsenal. Too often people fall at this final hurdle, and all the great work they've done is wasted because everyone gets frustrated trying to access the data. It's not just end users you need to think about either. Whether it's batch file sends to an external partner or API access for website personalisation, it's vital to ensure that you have a variety of mechanisms for unlocking all the value you have just created.
When put together, all of the above should stand you in good stead to go and build great analytics and big data platforms that people will love to use.
I'd welcome suggestions for topics to include, so please feel free to send them over to me and I'll try and work them in.
If you would like to talk to me about defining your data and analytics strategy then please get in touch.
You can also follow my company, Cynozure, on LinkedIn, Twitter and at our website.
UPDATE
Here are links to the articles:
Good stuff, James Lupton. Looking forward to our consultation in a couple of weeks to see what you make of our set up!
I keep clicking "Refresh" already, waiting for the next instalment! As someone just finalising a high-level solution architecture covering all the aspects above, I am very much looking forward to reading this.