Thinking in data
Data Vocabulary
Data Structure
Data can be placed in one of three categories based on the semantic level of its structure: unstructured, semi-structured, and structured.
When structure in data is discussed, there are normally three types, and XML and JSON documents are described as semi-structured.
But is that correct?
The definition of semi-structured data is that it has an easily discernible structure, as XML and JSON documents do. But when you add a schema you add a data model, and that makes the data structured.
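A minimal Python sketch of this distinction, under stated assumptions: the `Customer` record, its fields, and the `parse_customer` helper are hypothetical and chosen only for illustration. The JSON text on its own has a discernible structure; the dataclass plays the role of the schema that turns it into structured data.

```python
import json
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical data model (the 'schema') for one customer record."""
    id: int
    name: str


def parse_customer(text: str) -> Customer:
    """Parse JSON text and check it against the Customer model."""
    raw = json.loads(text)  # semi-structured: keys and values, but no guarantees
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    if set(raw) != {"id", "name"}:
        raise ValueError(f"unexpected fields: {sorted(raw)}")
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(raw["id"], int) or isinstance(raw["id"], bool):
        raise ValueError("id must be an integer")
    if not isinstance(raw["name"], str):
        raise ValueError("name must be a string")
    return Customer(id=raw["id"], name=raw["name"])
```

Calling `parse_customer('{"id": 1, "name": "Ada"}')` returns a `Customer`, while a document that does not match the model, such as one with a string `id`, raises `ValueError`.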
Data Consistency
When we work with data, we do not like to leave anything in an inconsistent state. An inconsistent state can mean a lot of things.
In computer science we try to move data from one consistent state to another consistent state. There are many ways to do so. In a file transfer, for example, we compare the consistency at the start of the transfer with the consistency at the end. We also try to take smaller steps where possible: continuing the example, we evaluate consistency on file chunks instead of on the whole file. If the file is transferred over TCP/IP, there are also packet consistency checks.
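The file-transfer example can be sketched in Python as follows. This is an illustration, not a real transfer protocol: the function names, the choice of SHA-256, and the tiny chunk size are all assumptions made for the sketch.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4  # tiny on purpose for illustration; real transfers use far larger chunks


def digest(data: bytes) -> str:
    """Return a SHA-256 hex digest used as the consistency check."""
    return hashlib.sha256(data).hexdigest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def corrupted_chunks(sent: bytes, received: bytes, size: int = CHUNK_SIZE) -> List[int]:
    """Return the indexes of chunks whose digests differ between both sides."""
    sent_digests = [digest(c) for c in split_chunks(sent, size)]
    recv_digests = [digest(c) for c in split_chunks(received, size)]
    bad = []
    # Compare position by position; a missing chunk counts as a difference
    for i in range(max(len(sent_digests), len(recv_digests))):
        s = sent_digests[i] if i < len(sent_digests) else None
        r = recv_digests[i] if i < len(recv_digests) else None
        if s != r:
            bad.append(i)
    return bad


sent = b"hello world!"
received = b"hello wOrld!"                        # one byte changed in transit
whole_file_ok = digest(sent) == digest(received)  # whole-file check
bad_chunks = corrupted_chunks(sent, received)     # chunk-level check
```

The whole-file check only says that something went wrong, while the chunk-level check also says where, so only the affected chunk needs to be sent again.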
Is Data Modelling only for databases?
The short answer is: No.
The long answer is: You work with modelling much more than you think. You are actually modelling whenever you modify the structure or the model of semi-structured or structured data.
When you are:
Do you need a structured process?
Probably not, especially if you know the normalization process and are working with simple data. It is still a benefit to think about the consistency of data, even in NoSQL scenarios, because of data ownership. The answer is yes if things get a bit more complicated or you create table properties in a relational database. Normally this is not done, which makes the data model deteriorate and makes it more difficult to handle later.
The full-blown version of data structure normalization is the one used in database normalization (https://en.wikipedia.org/wiki/Database_normalization). Normally I only use BCNF when working with data structure normalization, because it is easier to go directly for that, and from there I de-normalize. As I usually say, I normalize further than I need to in order to have control over the data owner.
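A small sketch of "normalize first, then de-normalize", using Python's built-in `sqlite3` module. The `customer`, `purchase`, and `purchase_report` tables and the `rebuild_purchase_report` helper are hypothetical names invented for this example. In the two normalized tables every determinant is a key, so each customer name is stored exactly once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each customer name is stored exactly once
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        item TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO purchase VALUES (10, 1, 'keyboard'), (11, 1, 'mouse');
""")


def rebuild_purchase_report(conn: sqlite3.Connection) -> None:
    """De-normalize: derive a flat read table from the normalized tables."""
    conn.executescript("""
        DROP TABLE IF EXISTS purchase_report;
        CREATE TABLE purchase_report AS
            SELECT p.id AS purchase_id, c.name AS customer_name, p.item
            FROM purchase p JOIN customer c ON c.id = p.customer_id;
    """)


rebuild_purchase_report(conn)
# A name change is made once in the normalized table, then the copy is rebuilt
conn.execute("UPDATE customer SET name = 'Ada L.' WHERE id = 1")
rebuild_purchase_report(conn)
rows = conn.execute(
    "SELECT customer_name, item FROM purchase_report ORDER BY purchase_id"
).fetchall()
```

The flat `purchase_report` table repeats the customer name on every row, which is convenient for reading. Because it is always derived from the normalized tables, it is still clear where a name is allowed to change.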
Data Owner
I usually define the data owner as the one location where developers are allowed to change data when the system copies that data automatically to other places, or as the one service function that has that responsibility.
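One way to picture this definition is the sketch below. It is an assumption-laden illustration, not a description of any particular system: the `CustomerNameOwner` class, its methods, and the `search_index` copy are hypothetical names.

```python
from typing import Dict, List


class CustomerNameOwner:
    """Hypothetical data owner: the only place where names may be changed."""

    def __init__(self) -> None:
        self._names: Dict[int, str] = {}
        self._replicas: List[Dict[int, str]] = []

    def attach_replica(self, replica: Dict[int, str]) -> None:
        """Register a copy that the owner keeps in sync automatically."""
        replica.clear()
        replica.update(self._names)
        self._replicas.append(replica)

    def set_name(self, customer_id: int, name: str) -> None:
        """The single write path; every change is copied to all replicas."""
        self._names[customer_id] = name
        for replica in self._replicas:
            replica[customer_id] = name


owner = CustomerNameOwner()
search_index: Dict[int, str] = {}  # hypothetical copy that is read from, never written to
owner.attach_replica(search_index)
owner.set_name(1, "Ada")
owner.set_name(1, "Ada L.")
```

In this sketch the copies are plain dictionaries, so nothing technically stops other code from writing to them. The rule that all changes go through `set_name` is a convention that developers have to follow.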
I work with a data owner