'Big Data' - "Why" of the "What"...
‘Big Data’ is the buzzword of the day, and not without reason: it represents a paradigm shift in the way we try to understand the world. Its utility comes from the fact that it can not only reduce costs but also help us identify nuances that are not easily observable (hidden sub-processes, if you will). Consequently, as more and more data is captured, a lot of resources are going into developing tools to process it efficiently. While this paints a very rosy picture of the future, the assumption underpinning the entire concept risks being forgotten, and forgetting it could cause more damage than the benefits we expect.
We need to understand that every data point in the vast ocean of ‘Big Data’ is itself the output of an activity, which in turn follows from some thought or behavioural process. This gives each data point a contextual meaning, i.e. there is some context associated with every data point. So, when we use those data points as inputs to a model, we assume that the context is known (this is the assumption referred to earlier). This is very important because the model ultimately gives us outputs that are like pieces of a jigsaw puzzle; it is our interpretation of those results that helps us understand the processes being studied, analogous to assembling the pieces into one coherent picture. If we get this part wrong, the actions we finalize on the basis of the understanding derived from our model may produce drastically different results from what we expect (more often than not, on the wrong side of our expectations).
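To make the role of context concrete, here is a minimal sketch in Python. All counts, option names and context labels below are invented purely for illustration; they do not come from any real dataset. The same records suggest one conclusion when the context is ignored and the opposite conclusion when it is taken into account, a pattern commonly known as Simpson's paradox.

```python
# Hypothetical counts, invented for illustration only.
# Key: (option, context) -> (successes, trials)
records = {
    ("A", "context_1"): (18, 20),
    ("A", "context_2"): (24, 80),
    ("B", "context_1"): (64, 80),
    ("B", "context_2"): (4, 20),
}

OPTIONS = ("A", "B")


def pooled_rate(option):
    """Success rate for an option with all contexts lumped together."""
    successes = sum(s for (o, _), (s, _t) in records.items() if o == option)
    trials = sum(t for (o, _), (_s, t) in records.items() if o == option)
    return successes / trials


def rate_in_context(option, context):
    """Success rate for an option within a single context."""
    successes, trials = records[(option, context)]
    return successes / trials


def best_option_ignoring_context():
    """Option that looks best when the context is thrown away."""
    return max(OPTIONS, key=pooled_rate)


def best_option_in_context(context):
    """Option that looks best inside one specific context."""
    return max(OPTIONS, key=lambda o: rate_in_context(o, context))


if __name__ == "__main__":
    print("Ignoring context:", best_option_ignoring_context())
    for ctx in ("context_1", "context_2"):
        print("Within", ctx + ":", best_option_in_context(ctx))
```

With these made-up numbers, option B looks better once the contexts are pooled, yet option A does better inside each individual context. Which reading is the right one depends on how the data came to be generated, which is exactly the knowledge the model alone cannot supply.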
When we say ‘context’ in reference to a data point, we are simply stating that we need to have an idea of the data generating process behind it. In other words, we need a ‘model of the world’, where ‘world’ is the entire environment in which the process being studied takes place. This helps us put the jigsaw puzzle together coherently rather than force-fitting the pieces. However, given the diversity of the data points themselves, and the fact that the whole exercise is undertaken in order to understand the world, the assumption seems to present an insurmountable task. The only way out seems to be to learn as much as possible about how others have understood the world, but time is probably the biggest constraint on doing so. Here, therefore, I would like to talk about a framework that can act as a starting point for getting to know the ‘model of the world’. Its primary advantage is that it involves ‘compounding of learning’ (for lack of a better term), which significantly reduces the time needed to learn (though still not enough to learn everything).
The framework itself consists of three parts:
- Underlying Foundation – ‘Oneness of Knowledge’
- Study Laboratory – ‘Nature/Society’
- Phenomenon – ‘Beauty’
The above framework is intended to guide us in forming an idea of the 'model of the world'; however, as pointed out earlier, it is just a starting point.
As for how to implement the framework in real life, in my experience the only way is through diverse experiences accumulated through reading, travelling or any other means one chooses. More importantly, there should be no restrictions on what is experienced: in the context of reading, for example, non-fiction and fiction are equally effective.
P.S. - I was introduced to the ideas of 'data generating process' and 'model of the world' by Dr. Anil Doshi.