Analytical architecture evolution
This is the first in a series of posts where I'll share my thoughts on the evolution of the analytical architecture required to support marketing campaigns (although the architecture supports other use cases equally well!).
The motivation for writing this series is to connect with my extended network, share ideas, and hopefully pick up some new ones along the way. Please comment, share and connect!
Part 1: The traditional approach and the motivation for change
A traditional architecture would normally consist of several monolithic applications hard-wired together, including (but not limited to) an ETL tool, a relational database (usually with a number of dependent datamarts), statistical software for building predictive models, CRM software connected to various customer channels, and reporting software.
Historically there hasn’t been much abstraction between the various components, which means the data structure is normally fixed and hard-wired across them. For example, an ETL developer would build a job to load a data source into a staging table with a fixed schema, then integrate that data with other tables in the database to construct the final table (e.g. a CRM datamart).
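To make that hard-wiring concrete, here's a minimal sketch of the pattern (not any particular vendor's stack): all table and column names are hypothetical, and SQLite stands in for the warehouse database. Notice how the same fixed column list is baked into both the staging step and the integration step.

```python
# A minimal sketch of a hard-wired ETL job. Names are hypothetical;
# SQLite stands in for the relational database.
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Staging: the ETL job bakes the source's column list into a fixed schema.
conn.execute("""
    CREATE TABLE stg_customer (
        customer_id INTEGER,
        first_name  TEXT,
        segment     TEXT
    )
""")
conn.execute("INSERT INTO stg_customer VALUES (1, 'Ada', 'High value')")

# Another upstream table the integration step depends on.
conn.execute("CREATE TABLE txn_summary (customer_id INTEGER, lifetime_value REAL)")
conn.execute("INSERT INTO txn_summary VALUES (1, 1250.0)")

# 2. Integration: the same column list reappears, hard-wired a second time,
# when the staging data is joined with other tables to build the CRM datamart.
conn.execute("""
    CREATE TABLE crm_datamart AS
    SELECT s.customer_id, s.first_name, s.segment, t.lifetime_value
    FROM stg_customer s
    JOIN txn_summary t ON t.customer_id = s.customer_id
""")

print(conn.execute("SELECT * FROM crm_datamart").fetchall())
```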
This data structure is then mapped directly into the CRM application, allowing analysts to build campaigns and leverage its capabilities. At some point predictive models might get added into the mix (either through the database/ETL route or directly into the CRM application).
After the campaigns have been built, the data is loaded back into the database or, more frequently, into the statistical software (usually the path of least resistance) to fulfill the reporting requirement.
‘Now change it’: imagine we want to add a new piece of data into this process. We would need to manage the change across multiple points: the ETL jobs, the database objects, the mapping into the CRM application, and the reporting application. That is a lot of change, with a correspondingly large testing requirement.
How many times have we been told it would take three months of effort to add a single column to this type of architecture?
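As a toy illustration of why the ripple is so wide, consider how the same schema ends up repeated verbatim in every layer (all names here are invented, continuing the sketch above):

```python
# Each layer embeds its own copy of the schema; none of these artifacts
# share a definition, so they must be edited (and re-tested) in lockstep.
ETL_STAGING_DDL = "CREATE TABLE stg_customer (customer_id, first_name, segment)"
DATAMART_SELECT = "SELECT customer_id, first_name, segment FROM stg_customer"
CRM_FIELD_MAPPING = {"customer_id": "CustID", "first_name": "FirstName", "segment": "Segment"}
REPORT_QUERY = "SELECT segment, COUNT(*) AS n FROM crm_datamart GROUP BY segment"

def impacted_artifacts(column: str) -> list[str]:
    """List every hard-wired artifact that mentions a column by name."""
    artifacts = {
        "etl_staging_ddl": ETL_STAGING_DDL,
        "datamart_select": DATAMART_SELECT,
        "crm_field_mapping": str(CRM_FIELD_MAPPING),
        "report_query": REPORT_QUERY,
    }
    return [name for name, text in artifacts.items() if column in text]

# 'segment' appears in all four layers, so adding a sibling column such as
# 'email_optin' means touching every one of them, plus their tests.
print(impacted_artifacts("segment"))
```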
In part 2 I’ll introduce a different architectural approach which starts to address some of these challenges…