DataOps
I wrote a blog post about DataOps, but I think this is truly just the beginning.
What is DataOps?
DataOps are the people who get asked the following questions (the list is not exhaustive):
1. Why does this query run so slow?
2. Did the job run last night?
3. Why do we need more storage? Didn't we get enough last time? (This is a trick question. Data product owners should be doing the cost justification; DataOps implements the optimal storage infrastructure.)
4. Do we have a backup of that data somewhere? (This is a scary question to hear.)
5. Can you pull this data from S3, that data from Oracle, this other data from SQL Server, and drop it in a CSV file somewhere so I can analyze it?
6. Tableau/Cognos/Business Objects/Excel is having difficulty reading from Hadoop/Oracle/MySQL/SQL Server. Can you take a look and make it work?
7. Are we on the latest patch release?
8. Is this data current?
9. How long would it take to build a Cassandra/Hadoop/Oracle/SQL Server/MySQL cluster?
10. Can you just load this data somewhere so I can take a look at it?
11. What do you do all day?
Some of these are snarky, but I have actually seen some of these questions asked of a DataOps team.
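Questions like "Did the job run last night?" and "Is this data current?" tend to come up often enough that a small automated check can be worth having. Here is a minimal sketch in Python, assuming you can get the timestamp of the last successful load from somewhere (a job log, a metadata table, a file's modification time). The function name `is_current`, its parameters, and the example timestamps are all made up for illustration and do not come from any particular tool.

```python
from datetime import datetime, timedelta, timezone


def is_current(last_loaded_at, max_age, now=None):
    """Return True if the most recent load is no older than max_age.

    last_loaded_at: timezone-aware datetime of the last successful load
                    (where this comes from is an assumption: job log,
                    metadata table, file timestamp, and so on).
    max_age:        timedelta describing how stale the data may be.
    now:            optional timezone-aware datetime, mainly for testing.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


# Hypothetical example: the nightly job finished at 02:15 UTC and
# someone asks about it at 09:00 UTC the same day.
last_load = datetime(2024, 1, 2, 2, 15, tzinfo=timezone.utc)
check_time = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)

print(is_current(last_load, timedelta(hours=24), now=check_time))  # True
```

The check itself is trivial on purpose. The hard part is usually agreeing on where the "last loaded" timestamp lives and how old is too old for each data set.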
DataOps makes sure the data-specific infrastructure is working as it should.
DataOps makes the data work.