Conquering Data Overload
We are drowning in data, with new data produced at a rapidly increasing rate every second. Every interaction between human and machine, or between machines, gives birth to new data. Organisations have also shown flexibility in leveraging data from outside their enterprise firewalls, and this is seen as a significant differentiator for gaining competitive advantage.
With more information coming from more sources, many organisations struggle to manage data’s exploding volume, diversity and complexity. Not only is storing data costly, but conventional data management practices can strain IT resources.
Organisations need to adopt the right data strategy to conquer "data overload".
1. Managing Data Deluge (MDD): The data explosion has brought challenges of volume, variety, velocity and veracity. The likes of HDFS have solved the storage and cost problem, but IT teams still spend a lot of time collecting, preparing, cleansing, harmonising and blending data. The data deluge has also created the need for a robust data management platform that can support heterogeneous data stores. Given the complexities involved, no single tool solves this problem. Below are the key topics that need to be carefully addressed to architect a robust and scalable data analytics platform.
Architecture: A forward-looking reference architecture that not only addresses today's business problems but also provides the flexibility and capabilities to solve tomorrow's.
Data processing: You need to consider diverse ingestion methods to source disparate data in a loosely coupled fashion. On one hand, traditional ETL is needed to handle structured bulk data; on the other, tools supporting enrichment processes such as NLP and ontology modelling are needed to process data from social media, devices, interactions, etc.
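The loose coupling described above can be sketched with a small ingestor registry: each source type gets its own ingestor behind a common interface, so new sources plug in without touching the pipeline. The source types, field names and the naive keyword "enrichment" here are illustrative assumptions, not anything prescribed by the article.

```python
from abc import ABC, abstractmethod

class Ingestor(ABC):
    """Common interface so new sources can be added without changing the pipeline."""
    @abstractmethod
    def ingest(self, raw):
        ...

class BulkETLIngestor(Ingestor):
    """Structured bulk data: a stand-in ETL step that normalises field names."""
    def ingest(self, raw):
        return {k.lower(): v for k, v in raw.items()}

class TextEnrichmentIngestor(Ingestor):
    """Unstructured text: a stand-in for an NLP enrichment step (naive keyword tagging)."""
    KEYWORDS = {"outage", "refund", "login"}

    def ingest(self, raw):
        words = set(raw.lower().split())
        return {"text": raw, "tags": sorted(words & self.KEYWORDS)}

# Registry keyed by source type keeps sources and processing loosely coupled.
INGESTORS = {"erp": BulkETLIngestor(), "social": TextEnrichmentIngestor()}

def ingest(source_type, payload):
    return INGESTORS[source_type].ingest(payload)
```

Adding a new source is then a matter of registering one more ingestor, which is the property the "diverse ingestion methods" point is after.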
Data storage: The platform should be able to store hybrid data types in the same repository so that they can interact with each other for maximum value creation. How do we govern usage, access, processing and consumption patterns? How do we address the data security of external versus internal data?
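One way to start answering the governance questions above is a simple attribute-based access policy that distinguishes internal from external data. This is a minimal sketch; the origins, classifications and roles are hypothetical labels, not a specific product's model.

```python
# Policy table: (origin, classification) -> roles allowed to read.
# Externally sourced data gets a narrower audience than internal data,
# reflecting the internal-vs-external security question raised above.
POLICY = {
    ("internal", "public"): {"viewer", "analyst", "steward"},
    ("internal", "restricted"): {"analyst", "steward"},
    ("external", "licensed"): {"steward"},
}

def can_read(role, dataset):
    """Check whether a role may read a dataset in the hybrid repository."""
    allowed = POLICY.get((dataset["origin"], dataset["classification"]), set())
    return role in allowed
```

A table like this also doubles as documentation of consumption patterns, which helps when auditing who touches externally licensed data.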
2. Minimum Viable Insight (MVI): Minimum viable insight is the ability to get maximum value from existing datasets in the shortest turnaround time. This is only possible with a test-and-learn sandbox and the right data science skills. It is hard to find data science experts who can discover and produce new business insights in a short turnaround time (as short as a few hours). Organisations are struggling to find the right technology and domain expertise to unlock hidden opportunities and insights. Below are two key building blocks for better MVI.
Data & insights: Integrating and blending disparate data sources is a must-have capability for gaining new insights. Business needs a quick sandbox setup for test and learn so that insights can be turned into actions sooner. Continuously orchestrating insight generation at the speed of business hypotheses is also a challenge, so enough thought needs to be put into analytics delivery methods.
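The blending step above can be sketched as a simple left join in a sandbox: one source of CRM records joined against web-interaction counts on a shared key. The datasets and field names are illustrative assumptions for the sketch.

```python
# Two disparate sources, assumed already ingested into the sandbox.
crm = [
    {"customer_id": 1, "segment": "enterprise"},
    {"customer_id": 2, "segment": "smb"},
]
web_events = [
    {"customer_id": 1, "visits": 14},
    {"customer_id": 3, "visits": 2},  # unmatched rows surface data-quality gaps
]

def blend(left, right, key):
    """Left join: keep every left-hand record, attach right-hand fields on key match."""
    index = {row[key]: row for row in right}
    return [{**l, **index.get(l[key], {})} for l in left]

blended = blend(crm, web_events, "customer_id")
```

In a real sandbox a dataframe library would do this join, but even this toy version shows why blending surfaces gaps (customer 2 has no web data) that the test-and-learn loop must handle.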
Consumption framework: It is not only about getting the right insights but also about presenting them in a simple, intuitive way so that insights get converted into actions. You need a multiple-visualisation-tool approach depending on the maturity of the business community and the target audience. Traditional BI reporting tools need to be supplemented with API enablement and advanced visualisation tools, enabling self-service BI and visual data discovery that promote a culture of data experimentation.
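"API enablement" can be as small as exposing pre-computed insights as JSON so any visualisation tool can consume them. A minimal sketch with the Python standard library, where the insight name and payload are hypothetical:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical pre-computed insight; in practice this would come from the analytics platform.
INSIGHTS = {"churn_risk": {"segment": "smb", "score": 0.27}}

class InsightHandler(BaseHTTPRequestHandler):
    """Serve insights as JSON so BI and visualisation tools can pull them over HTTP."""

    def do_GET(self):
        key = self.path.strip("/")
        if key in INSIGHTS:
            body = json.dumps(INSIGHTS[key]).encode()
            self.send_response(200)
        else:
            body = b'{"error": "not found"}'
            self.send_response(404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

# To serve: HTTPServer(("127.0.0.1", 8000), InsightHandler).serve_forever()
```

Because the contract is plain JSON over HTTP, the same insight can feed a traditional BI report, an advanced visualisation tool, or a self-service notebook without re-implementation.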
Organisations need to enhance their capabilities to manage the data deluge and, at the same time, continuously increase their analytics maturity to deliver meaningful and actionable insights, so that they can keep extracting maximum value from the information explosion.
Source: Wales Higher Education Libraries Forum