5 steps to build a data science team
Building a World-Class Analytics Center from the Ground Up
Originally presented to the ANA in NYC on December 9, 2015.
Executive Summary
A strong analytics team is essential for any company that wishes to use data to optimize business strategies and keep up with changes in the marketplace. In the paper below I discuss the five steps to take to enhance an existing analytics program or build one from the ground up.
Insights
Having a robust analytics team is essential for companies that wish to use data to improve their marketing strategies and achieve long-term business goals. But the task of putting together a team can be challenging. How do you assemble a staff who can hit the ground running? What tools do you need to mine data and measure it? How do you present the results to influence decision-making? There are five steps that companies can take to build a successful, well-rounded analytics team from the ground up — or improve an existing one.
No. 1: Choose the right structure.
Depending on the needs of your company and how it is structured, there are three different types of operating models that can be used to structure a new analytics program:
- The Federal Approach: With this approach, data analysts report up from different departments within the organization. The benefit of this structure is that the analysts are knowledge experts and are fully immersed in the day-to-day activities of their departments. However, because they are scattered across the company, there is often very little communication between them, and as a result, there’s a lack of consistency (especially with regard to methods and definitions) across the board.
- The Centralized Approach: Using this method, there is a core group of analysts who report from within a single department. The upside to this model is that the analysts are able to leverage their different skill sets, cross-train, and employ the highest levels of consistency across the organization. The downside is that this group is often isolated from other departments and is not entrenched in what is happening day to day.
- The Hub and Spoke Approach: This approach is essentially an amalgamation of the first two in which the analysts report from different departments, but also form a core group. This can help organizations efficiently analyze data in a consistent way while breaking down some of the communication silos.
No. 2: Build the best team.
After implementing a structure for analysts to work within, it’s important to identify the roles that need to be filled and to build a team with analysts who are able to manage all of the key areas. When it comes time to hire your team, you’ll need to find people to fill the following roles:
- Leadership: The responsibilities of the leader are to oversee the process, train the staff, prioritize the projects, and present results to the C-Suite. A new analytics team absolutely needs a leader who possesses strong mathematical modeling skills. The reason is simple: Mathematical modeling skills are hard to learn and require years of experience working under experts. While data mining skills and business savvy are certainly valuable, these should ultimately be secondary considerations, since they are skills that can be easily learned. Using actuaries to fill the leadership positions can be beneficial, since these professionals are trained to analyze the consequences of risk and to use mathematics, statistics, and financial theory to study and forecast uncertain future events.
- Statistical Analyst: This individual tests for significance in KPI changes, sets up unbiased test/control groups, and does all of the modeling and forecasting. A qualified candidate should have up to three years of practical experience, with a master’s degree in applied statistics. If a company is building a program from the ground up, it should foster relationships with local universities and hire promising candidates directly from those graduate programs.
- Financial Analyst: The primary job of this individual is to incorporate all of the relevant business information (such as ROIs and Interest Rate Theory) into the analysis. Ideally, candidates would have at least three to five years of experience, with a substantial amount of ad-hoc work on their resumes.
- Data Analyst: This person will extract and clean the data for the rest of the team. It’s important to have one member of your team dedicated to just this task, since normal analysts can spend up to 90 percent of their time cleaning and validating data instead of doing a proper analysis of it.
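As a rough illustration of the significance testing the statistical analyst would own, the sketch below runs a two-sided, two-proportion z-test on a conversion-rate KPI. The choice of test, the function name, and the counts are assumptions made for illustration; they are not from the original presentation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_test, n_test, conv_control, n_control):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    p_test = conv_test / n_test
    p_control = conv_control / n_control
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_test + conv_control) / (n_test + n_control)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 260 of 2,000 test customers converted
# versus 200 of 2,000 control customers.
z, p_value = two_proportion_z_test(260, 2000, 200, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the KPI change is unlikely to be noise, but the conclusion is only as good as the test and control groups behind the counts.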
No. 3: Measure for success.
You can’t analyze data without measuring something. But how your newly formed analytics team begins to measure data is largely dependent on how that data is currently organized at your company. Most likely, your company’s Information Systems department is already tracking everything and storing that data in tables on an SQL server maintained by the IT department. The collection of those tables is called a data warehouse.
The data warehouse, however, is usually hard to access, poorly labeled, and unstructured — which is why it’s essential to create a data dictionary. A data dictionary labels the data stored in the warehouse in a meaningful way and identifies relationships between the data. When first getting off the ground, your analytics team can start defining data and its associations piecemeal, with either Word or Excel tables, and refine the focus based on the company’s needs.
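A data dictionary does not need special software to get started. As a minimal sketch of the idea, the entries below map cryptic warehouse columns to meaningful labels and record how tables relate. The table names, column names, and helper functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnEntry:
    """One data dictionary row: what a warehouse column means."""
    table: str
    column: str
    label: str          # business-friendly name
    description: str
    related_to: list = field(default_factory=list)  # e.g. ["CUST_MSTR.CUST_ID"]


# Hypothetical warehouse tables and columns, for illustration only.
DATA_DICTIONARY = [
    ColumnEntry("TXN_HDR", "CUST_ID", "customer_id",
                "Customer who made the purchase", ["CUST_MSTR.CUST_ID"]),
    ColumnEntry("TXN_HDR", "TXN_AMT", "purchase_amount",
                "Total purchase amount in US dollars"),
    ColumnEntry("CUST_MSTR", "CUST_ID", "customer_id",
                "Unique customer identifier"),
]


def describe(table, column):
    """Look up the entry for a warehouse column, or None if it is undocumented."""
    for entry in DATA_DICTIONARY:
        if (entry.table, entry.column) == (table, column):
            return entry
    return None


def undocumented(table, columns):
    """List the columns of a table that still need a dictionary entry."""
    return [c for c in columns if describe(table, c) is None]
```

Calling `undocumented("TXN_HDR", ["CUST_ID", "TXN_AMT", "STR_NBR"])` would flag `STR_NBR` as the next column to define, which fits the piecemeal approach described above.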
Only with a clearly defined data dictionary can your IT or IS teams build data marts. Data marts are smaller versions of the data warehouse with clearly labeled data and relationships. Without a data mart, your analytics team won’t be able to use business intelligence tools (like Tableau) to retrieve, analyze, transform, and report data. (Instead, they would have to learn SQL to access the data.)
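To make the idea concrete, here is a small sketch using an in-memory SQLite database as a stand-in for the warehouse. A real data mart is usually a separate store built by IT; here a labeled view plays that role. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse server

# Hypothetical warehouse tables with cryptic column names.
conn.executescript("""
    CREATE TABLE CUST_MSTR (CUST_ID INTEGER PRIMARY KEY, SEG_CD TEXT);
    CREATE TABLE TXN_HDR (TXN_ID INTEGER PRIMARY KEY, CUST_ID INTEGER, TXN_AMT REAL);
    INSERT INTO CUST_MSTR VALUES (1, 'A'), (2, 'B');
    INSERT INTO TXN_HDR VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

    -- The "data mart": clearly labeled columns with the relationship already joined.
    CREATE VIEW customer_purchases AS
    SELECT c.CUST_ID AS customer_id,
           c.SEG_CD  AS customer_segment,
           t.TXN_AMT AS purchase_amount
    FROM TXN_HDR AS t
    JOIN CUST_MSTR AS c ON c.CUST_ID = t.CUST_ID;
""")

# A reporting query can now use the labeled names from the data dictionary.
rows = conn.execute("""
    SELECT customer_segment, SUM(purchase_amount)
    FROM customer_purchases
    GROUP BY customer_segment
    ORDER BY customer_segment
""").fetchall()
print(rows)
```

The point of the design is that the join and the renaming happen once, in the mart, so a BI tool pointed at `customer_purchases` never has to know about `TXN_HDR` or `SEG_CD`.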
No. 4: Put insights into action.
Once the analytics team has identified insights from their analysis of the data, what are the next steps? First, the team will want to do some pre-launch analysis to check that their insights are causal and repeatable. The easiest way to test this is through holdout sampling. Essentially, this means that analysts withhold part of the data during the modeling process and build the model on the remaining data. Once they have a final model, they use the withheld data for one last test.
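The holdout idea can be sketched in a few lines. The example below uses synthetic data and a simple least-squares line as a stand-in for whatever model the team actually builds; the 20 percent holdout share, the error metric, and the function names are assumptions for illustration.

```python
import random


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(points, intercept, slope):
    """Average absolute gap between actual and predicted y."""
    return sum(abs(y - (intercept + slope * x)) for x, y in points) / len(points)


rng = random.Random(42)  # fixed seed so the split is repeatable

# Synthetic data for illustration: an outcome that rises with x, plus noise.
data = [(x, 50 + 3 * x + rng.gauss(0, 5)) for x in range(100)]

rng.shuffle(data)
holdout, training = data[:20], data[20:]  # withhold 20% before any modeling

intercept, slope = fit_line(training)     # the model never sees the holdout
train_error = mean_abs_error(training, intercept, slope)
holdout_error = mean_abs_error(holdout, intercept, slope)
print(f"train MAE = {train_error:.2f}, holdout MAE = {holdout_error:.2f}")
```

If the holdout error is much worse than the training error, the model has likely latched onto quirks of the training data and the insight may not repeat.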
After the team has modeled the insights, they can move forward to the test-and-learn phase, which consists of three different parts:
- The Disruption Test: The team would start with a hand-selected group of locations or regions and test the changes. They would work with the affected teams to learn how these insights pan out and make any necessary adjustments.
- The Pilot Test: Next, the team would roll out the adjusted campaign to a statistically significant sample group. (Beforehand, be sure to set clearly defined KPI targets and have your statistical analyst find unbiased control and test groups.) Once the test is complete, analyze the pilot results to segment the winners from the losers.
- The Intelligent Rollout: Using the information from the pilot test, roll out the new campaign to the winning segment (if possible). Make sure to set up any follow-up reporting measures before the rollout.
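The pilot and rollout steps above can be sketched as follows: locations are assigned to test and control groups at random, and the pilot results are then compared segment by segment. The function names, segment names, and sales figures are hypothetical, and the sketch assumes every segment has both test and control observations.

```python
import random
from collections import defaultdict


def assign_groups(location_ids, test_share=0.5, seed=0):
    """Randomly split locations into test and control so neither is hand-picked."""
    shuffled = list(location_ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * test_share)
    return set(shuffled[:cut]), set(shuffled[cut:])


def lift_by_segment(results):
    """results: iterable of (segment, group, kpi) with group 'test' or 'control'.

    Returns {segment: mean test KPI minus mean control KPI}.
    """
    values = defaultdict(lambda: {"test": [], "control": []})
    for segment, group, kpi in results:
        values[segment][group].append(kpi)
    return {
        seg: sum(g["test"]) / len(g["test"]) - sum(g["control"]) / len(g["control"])
        for seg, g in values.items()
    }


# Hypothetical pilot results: average weekly sales per location.
pilot = [
    ("urban", "test", 120.0), ("urban", "test", 130.0),
    ("urban", "control", 100.0), ("urban", "control", 110.0),
    ("rural", "test", 78.0), ("rural", "test", 82.0),
    ("rural", "control", 80.0), ("rural", "control", 84.0),
]
lifts = lift_by_segment(pilot)
winners = [seg for seg, lift in lifts.items() if lift > 0]
print(lifts, winners)
```

A positive lift alone does not make a segment a winner; a real analysis would also check each lift for statistical significance against the KPI targets set before the pilot.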
For a team that is just getting its feet wet, some of the first analyses and tests should focus on segmentation and seasonality. Segmentation is unique in every industry, and the analytics team can use Customer Lifetime Value to examine consumer behavior across various segments and brainstorm strategies from the results. In terms of seasonality, understanding how different consumer segments behave over time can help with promotion strategies. For instance, some customers respond more strongly at certain times, and knowing that in advance can help the company plan how to allocate resources.
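As one way to compare segments, the sketch below computes a simplified Customer Lifetime Value per segment. It assumes a constant margin per period, a constant retention rate, and a fixed discount rate, which is only one of several common CLV simplifications. The segment names and figures are hypothetical.

```python
def customer_lifetime_value(margin_per_period, retention_rate, discount_rate):
    """Simplified CLV: discounted margin from a customer retained at a constant rate.

    Closed form of the sum of margin * (retention / (1 + discount)) ** t
    over t = 1, 2, 3, ...
    """
    if not 0 <= retention_rate < 1:
        raise ValueError("retention_rate must be in [0, 1)")
    return margin_per_period * retention_rate / (1 + discount_rate - retention_rate)


# Hypothetical segments: (annual margin per customer, annual retention rate).
segments = {
    "loyalty members": (120.0, 0.85),
    "occasional buyers": (60.0, 0.50),
}
DISCOUNT_RATE = 0.10  # assumed annual discount rate

clv = {
    name: customer_lifetime_value(margin, retention, DISCOUNT_RATE)
    for name, (margin, retention) in segments.items()
}
for name, value in sorted(clv.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

Ranking segments this way gives the team a starting point for brainstorming where retention or promotion spending is likely to pay off.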
No. 5: Build on success.
With any new venture, it’s often a challenge to gain immediate credibility. Companies need to take steps to build a culture in their organization that takes analytics seriously:
- Governance: Report through the CFO rather than the CMO, if possible, in order to have a better chance of being taken seriously and of influencing more executive-level decisions.
- Consistency and Repeatability: Make sure that each department (as well as any vendors or agencies) consistently defines metrics in the same way.
- Sharing Insights: Start with easy wins by finding data which backs up an idea that is already believed by management. Being able to affirm what management already believes and give them actionable information to make decisions is one of the easiest ways to start building your team’s credibility.
Question and Answer
- Why do you suggest reporting through the CFO rather than the CMO? Even though the marketing department is our top customer, I prefer keeping them at arm’s length. Everything in the marketing department needs to happen immediately, so keeping some distance between them and the analytics team allows the analysts to manage the workflow more efficiently.
- Can you talk a little bit more about who shares the results of your analytics department in your organization and how they do it? Right now, I handle presenting all of the results. For a lot of data scientists, it’s something that definitely needs to be taught, and there is always a transition involved. It’s all about understanding the audience and figuring out how to communicate the most relevant details in the easiest way possible. But these skills are something that you can train your analytics team to do, and it’s something that’s much easier to train than quantitative skills. In the beginning, you can always have someone outside of the analytics department present the results, but some of the details might be lost in translation.
- What is your take on external versus internal modeling? The problem with using external modelers is that they don’t have the rich understanding or details about customers that your internal teams may have. Internal modeling gives you much wider access and a richer knowledge base to work with.
- How do you field background research? Our marketing department does focus testing, which our team gets the results from. All of the modeling that our team does uses internal data. We use external data vendors to gather data when looking at prospective customers.
- What do you think about BI tools? Some find BI tools to be invaluable since they don’t have the background to build reporting from the ground up. I personally don’t like them and prefer to build my own reporting tools internally. There is no right answer, however. It all depends on the structure and needs of your organization.
Call me lucky, but I was in a position where, prior to building the reporting, I helped define the parameters and the location of the information; then the guy would tell me whether he could do it or would go back to his desk to find a way. It takes a collaborative effort in any structure, and I do find that building your own reporting can be as bad as BI at the end of the day, because sometimes you end up with too much data either way, which can hinder defining action items. In my experience, you pull data and at times it is either all or nothing depending on the location of the data, and you have to take time to muddle through it.