What Is Data?
What is data? It seems like a simple, dimwitted question. But what is it? A series of characters strung together to form a word? A group of words strung together to make a sentence, a paragraph, an essay, a book? Answering the question requires the arduous effort of defining the purpose of a series of elements that, when put together, describe something. Without Merriam-Webster or the Oxford English Dictionary, characters strung together and spoken would be nothing more than grunts.
So what is the point of this discussion? If you look at most software installations and try to find a set of definitions for the data elements used, you would normally be hard pressed to find anything. Yet the entire business is being run using data with no definitive description of what the attributes of information, maintained at great expense, mean or are meant to represent. Therefore, instead of having a defined attribute understood by all, you have an attribute open to interpretation based upon each individual's connotation or belief of the meaning.
Take something as simple as a part number. What is a part number? How many characters are available for use? Part numbers were originally created to be able to catalog an item in, well, a catalog: some item which was capable of being bought and sold. In order to track the specific movement of said item, it had to be called something. Before computers, items could be cataloged simply by using a general term for the item: Hammer, Claw Hammer, Ball Peen Hammer, or Jack Hammer. These items were generally understood by all concerned and easily found or built. Now usher in the industrial revolution, and we need to catalog hundreds of items, store them, plan for them, purchase them, and build sub-assemblies and final assemblies. A more effective method was needed to catalog items. I’m not going into any more history or the merits of one part numbering system over another. I only want to make the point: what is a part number? Is there a written definition and procedure that describes:
- The limit on the characters that can be used?
- Can only numeric characters be used? It is, after all, called a part number.
- Is there significance to the characters and order of the characters?
- Do they have to be unique?
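To make the point concrete, here is a minimal sketch of what a written part number definition might look like once it is captured somewhere a system can enforce it. Everything in it is an assumption made up for illustration: the PartNumberRule name, the 12-character limit, and the two-letters-dash-digits format are hypothetical, not taken from any real system. The only point is that each question in the list above gets an explicit, recorded answer.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PartNumberRule:
    """Hypothetical written definition of a part number (all values are assumptions)."""
    max_length: int = 12                 # limit on the number of characters
    numeric_only: bool = False           # digits only, or are letters allowed?
    pattern: str = r"[A-Z]{2}-\d{4,9}"   # significance and order of the characters
    must_be_unique: bool = True          # may two items share a part number?


def check_part_number(value, rule, existing):
    """Return a list of rule violations; an empty list means the value conforms."""
    problems = []
    if len(value) > rule.max_length:
        problems.append("too long")
    if rule.numeric_only and not re.fullmatch(r"[0-9]+", value):
        problems.append("non-numeric characters")
    if not re.fullmatch(rule.pattern, value):
        problems.append("does not match the documented format")
    if rule.must_be_unique and value in existing:
        problems.append("duplicate")
    return problems
```

Calling check_part_number("HM-1001", PartNumberRule(), set()) returns an empty list, while the same call with "hammer" reports that the value does not match the documented format. Whether those are the right rules is a business decision; the sketch only shows that the decision has been written down.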
Google is a wonderful thing. Type a string of words, press enter, and back come 103,000,000 results in 0.30 seconds. Then ask the question (to yourself, not Google): So what? Interesting reading, but did I have a purpose for the entry? In the case of this writing, it was simply to see what would be returned. Now go to your system and presume that it had a Google-like search. What would be returned if you entered one of these phrases:
- What is my current inventory value?
- What type of material do I have the most of?
- What material do I sell the most of?
- What markets provide me the highest margin?
- What is my current backlog?
Now ask yourself: how would the software have to be structured to provide a specific answer to any such question? With all the “Business Analytics” software on the market promising these tools, it should be the proverbial no-brainer that every company of any size would have that software installed. But the process for implementing these solutions requires, no, demands the arduous task of looking at the processes being supported and at what the set of attributes (also known as data fields), collected together to make up a record or row of data in a database, is supposed to represent. What is required to sort and summarize some value that will provide meaningful data?
I remember back when Information Technology used to be called Data Processing. Somewhere along the way the idea arose that data processing didn’t convey what was being done. So the term Information Systems was coined to assert that all that data being processed provided information. But of course that wasn’t good enough, because a system could be a steno pad or a green ledger book. So Information Technology was born.
Now think about why, when you ask your information technology technicians to create a backlog report, you end up with nothing close to answering your question: what is my backlog? Has it been made clear what the term “backlog” is meant to describe? I’ve been using some rather simple examples in this writing, but when the questions become more complex, when you really want to do some serious data mining, if your data has no structure then 103,000,000 results in 0.30 seconds is what you’ll be poring over, trying to find something meaningful. Have fun.
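As an illustration of why the definition matters, here is a small sketch in which "backlog" has to be spelled out before a number can be produced. The OrderLine fields and both definitions (all open order value, or only value that is past its promise date) are assumptions invented for this example; your business may mean something else entirely, which is exactly the problem.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OrderLine:
    # Hypothetical fields; a real system would name and define its own.
    ordered_qty: int
    shipped_qty: int
    unit_price: float
    promised: date
    cancelled: bool = False


def backlog_value(lines, as_of, past_due_only=False):
    """Assumed definition: value of the unshipped quantity on lines that are not cancelled.

    With past_due_only=True, count only lines promised before as_of.
    """
    total = 0.0
    for line in lines:
        open_qty = line.ordered_qty - line.shipped_qty
        if line.cancelled or open_qty <= 0:
            continue  # nothing left to ship on this line
        if past_due_only and line.promised >= as_of:
            continue  # not yet late, so excluded under the narrower definition
        total += open_qty * line.unit_price
    return total
```

The same order lines give two different answers depending on which definition the requester had in mind. A report writer who was never told which one is meant can only guess.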
I've had clients who considered a several-gigabyte object to be a single data item: a movie stored in digital format! I've had clients who kept track of the constant polling number from an electronic device that was a single bit in length. Data as an asset is described and defined by business need. It never ceases to amaze me how companies whose livelihood depends on their data consider the maintenance of said assets a liability!