Data Processing Over the Decade
Data Processing Over The Decade - Part 1

This article covers the lessons and challenges from more than a decade of my career, and how we solved them, across roles including Developer, Lead, Manager, Java Developer, Database Developer, Data Warehouse/ETL Developer, BI/Report Developer, Data Modeler, Database Administrator, Database Architect, Data Analyst, Product Analyst, and Data Engineer. The journey continues.

Use case 1 - OLTP vs OLAP

  • Problem Statement -

Early in my career as a software developer, we wanted to retrieve reporting data from the same database that recorded our transactions. This worked in some cases, but running large reports, such as monthly or yearly reports, often brought the application down.

  • Solution -

We copied all the data from the source database, i.e. the OLTP (OnLine Transaction Processing) database, to a new OLAP (OnLine Analytical Processing) database, which we then used for all reporting requirements.
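The copy step can be sketched in a few lines. This is only a minimal illustration, assuming two in-memory SQLite databases stand in for the OLTP and OLAP systems; the `orders` table, its columns, and the `copy_table` helper are hypothetical names, not the ones we actually used.

```python
import sqlite3

# Hypothetical schema shared by both databases in this sketch.
SCHEMA = "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL, modified_date TEXT)"

def copy_table(oltp, olap):
    """Copy every row of the hypothetical `orders` table from OLTP to OLAP."""
    rows = oltp.execute("SELECT id, amount, modified_date FROM orders").fetchall()
    olap.execute(SCHEMA)
    # INSERT OR REPLACE keeps the copy idempotent if the job is re-run.
    olap.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    olap.commit()
    return len(rows)

# Stand-in for the transactional (OLTP) database.
oltp = sqlite3.connect(":memory:")
oltp.execute(SCHEMA)
oltp.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-05"), (2, 20.0, "2024-01-20"), (3, 5.0, "2024-02-02")],
)
oltp.commit()

# Stand-in for the reporting (OLAP) database.
olap = sqlite3.connect(":memory:")
copied = copy_table(oltp, olap)

# The monthly report now runs against the copy, not the live database.
monthly = olap.execute(
    "SELECT substr(modified_date, 1, 7), SUM(amount) FROM orders "
    "GROUP BY substr(modified_date, 1, 7) ORDER BY 1"
).fetchall()
```

The point of the design is that the heavy `GROUP BY` touches only the reporting copy, so the application writing to the source database is not affected by report queries.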

  • Learning -

OLTP is good for recording transactions. It is not meant for retrieving or analyzing large amounts of data, and for reporting purposes it is best treated as a write-optimized database.

OLAP can be used for Business Intelligence reporting and analysis, and can be treated as a read-mostly database: it is loaded in bulk and then queried.

Use case 2 - ETL

  • Problem Statement -

We started copying data from OLTP to OLAP every hour: we extracted the transactions of the last hour from the OLTP schema, selected by the Modified Date column, and copied them into the OLAP schema. Sometimes the operation ran smoothly. At other times the OLTP database got locked, retrieving the transactions took a long time, the application slowed down, and the job seemed never to finish.

  • Solution -

We changed the extract strategy from hourly to daily. Instead of retrieving transactions every hour, we retrieved them once a day as a batch at midnight, fetching all the transactions from the previous day.
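A daily extract of this kind can be sketched as follows. This is an illustration under assumptions, not our production job: SQLite stands in for the OLTP database, and the `orders` table, the `modified_date` column stored as `YYYY-MM-DD HH:MM:SS` text, and both helper functions are hypothetical.

```python
import sqlite3
from datetime import date, datetime, timedelta

def previous_day_window(run_date):
    """Return [start, end) timestamps covering the day before run_date."""
    start = datetime.combine(run_date - timedelta(days=1), datetime.min.time())
    end = datetime.combine(run_date, datetime.min.time())
    return start.isoformat(sep=" "), end.isoformat(sep=" ")

def extract_previous_day(conn, run_date):
    """Fetch rows modified during the previous day, in one batch."""
    start, end = previous_day_window(run_date)
    # Half-open range: a row stamped exactly at midnight belongs to one day only.
    return conn.execute(
        "SELECT id, amount, modified_date FROM orders "
        "WHERE modified_date >= ? AND modified_date < ? ORDER BY id",
        (start, end),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, modified_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-03-08 23:59:59"),  # two days before the run: skipped
        (2, 20.0, "2024-03-09 00:00:00"),  # previous day: extracted
        (3, 5.0, "2024-03-09 18:30:00"),   # previous day: extracted
        (4, 7.5, "2024-03-10 00:00:00"),   # run day: left for the next batch
    ],
)

# The job runs at midnight on 2024-03-10 and picks up all of 2024-03-09.
batch = extract_previous_day(conn, date(2024, 3, 10))
```

Using a half-open window (`>= start` and `< end`) means consecutive daily runs neither overlap nor leave gaps between batches.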

  • Learning -

Data should not be extracted from OLTP while the database is handling a lot of writes, i.e. during peak hours. Instead, extract during off-peak hours when there are fewer writes, which in our case was midnight. The columns used in the filter criteria need to be indexed properly to make data retrieval faster. Since then, read-only replicas and other strategies for data retrieval have emerged.
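The effect of indexing the filter column can be checked by looking at the query plan. The sketch below assumes SQLite and a hypothetical `orders` table with a `modified_date` column; the index name and the `plan_for` helper are illustrative, and other databases expose query plans through their own commands.

```python
import sqlite3

EXTRACT_SQL = (
    "SELECT id, amount FROM orders "
    "WHERE modified_date >= ? AND modified_date < ?"
)
WINDOW = ("2024-03-09 00:00:00", "2024-03-10 00:00:00")

def plan_for(conn, sql, params):
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, modified_date TEXT)")

# Without an index, the date filter has to scan the whole table.
plan_before = plan_for(conn, EXTRACT_SQL, WINDOW)

# Index the column used in the extract's filter criteria.
conn.execute("CREATE INDEX idx_orders_modified_date ON orders (modified_date)")
plan_after = plan_for(conn, EXTRACT_SQL, WINDOW)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, so the comparison to make is simply whether the index name appears in the plan after it is created.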

Note: These are a few of the basic problem statements. I will continue to share more use cases in a series of articles.

More articles by Dipti Pasupalak