The River of Data

⭐ Story: “The River of Data – The Tech Journey from the 90s to Now”

(with real use-cases + technology evolution)

Imagine data as water in a river.

Over three decades, this river became faster, smarter, and more connected — powered by the evolution of APIs, communication protocols, and streaming platforms.

Let’s travel through time.


🕰️ 📀 1990s – The Batch & RPC Era (Data flows slowly)

In the 90s, companies didn’t care about real-time. Systems spoke to each other occasionally — like sending letters by post.

How data moved

  • Files transferred via FTP
  • Batch jobs ran on cron schedules
  • APIs were rare; systems communicated using RPC (Remote Procedure Call), CORBA, or DCOM
  • Data sharing was heavy, rigid, and synchronous.

Tech of the 90s

✔ FTP, Telnet ✔ COBOL & Mainframes ✔ Cron Jobs ✔ RPC, CORBA, DCOM ✔ Early client–server TCP sockets ✔ SQL / Oracle batch loads

Use Case

🛒 Retail daily sales transfer
Stores created daily sales files and batch-uploaded them to head office (HO) at night.

The river was a bucket delivered once per day.
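A 90s-style nightly batch job can be sketched in modern Python. This is only an illustration of the pattern (daily file + scheduled FTP push); the file layout, function names, and FTP host are assumptions, not details from the article:

```python
import csv
import ftplib
from datetime import date

def export_daily_sales(transactions, path):
    """Write the day's transactions to a CSV batch file (the 'bucket')."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["store_id", "sku", "amount"])
        writer.writerows(transactions)
    return path

def upload_to_head_office(path, host="ftp.ho.example"):
    """Nightly FTP push to head office -- the host is a hypothetical placeholder."""
    with ftplib.FTP(host) as ftp:
        ftp.login()  # anonymous login, common on early internal FTP drops
        with open(path, "rb") as f:
            ftp.storbinary(f"STOR sales_{date.today():%Y%m%d}.csv", f)

# A cron entry such as `0 2 * * * python nightly_batch.py` would trigger
# the export (and then the upload) once per night.
batch_file = export_daily_sales(
    [("S001", "SKU-9", "120.50"), ("S002", "SKU-3", "89.00")],
    "daily_sales.csv",
)
```

Note how everything is offline and schedule-driven: if the job fails, the data simply waits for the next run.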


📟 2000s – The Web Era (SOAP, XML & Early Streaming)

The internet exploded → applications needed structured communication.

How data moved

  • Logs were generated continuously
  • “Near real-time” concepts started
  • Systems communicated using SOAP APIs, XML-RPC, and WSDL-based services

These were bulky, strict, and slow — but they enabled cross-company integration.

Tech of the 2000s

✔ SOAP, WSDL, XML-RPC ✔ JMS, IBM MQ, ActiveMQ ✔ Syslog live logs ✔ Early Publish/Subscribe ✔ AJAX (birth of live web)

Use Cases

🌐 Website traffic tracking
Companies monitored visits in near real-time using log streams.

The river began to flow continuously, but slowly.
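The "bulky but structured" flavor of this era is easiest to see in a SOAP envelope. The sketch below builds one with Python's standard library; the service namespace and the GetPageVisits operation are hypothetical examples, not a real API:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.com/traffic"  # hypothetical service namespace

def build_soap_request(page_url):
    """Build a SOAP 1.1 envelope for a hypothetical GetPageVisits operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}GetPageVisits")
    ET.SubElement(op, f"{{{SVC_NS}}}pageUrl").text = page_url
    return ET.tostring(envelope, encoding="unicode")

xml_payload = build_soap_request("/home")
# In practice this payload was POSTed with Content-Type: text/xml plus a
# SOAPAction header, and the WSDL contract described the operation and types.
```

Even this trivial request carries namespaces, an envelope, and a body — exactly the strictness (and weight) that made cross-company integration possible but slow.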


📲 2010s – Real-Time Era (REST, WebSockets, Kafka, Spark)

Smartphones, social media, IoT → real-time became mandatory.

How data moved

  • Events generated every millisecond
  • Real-time pipelines established
  • APIs became: REST (JSON over HTTP), WebSockets for two-way communication, and MQTT for IoT

Streaming systems matured:

  • Kafka (2011)
  • Storm
  • Spark Streaming

Tech of the 2010s

✔ REST APIs (dominant) ✔ WebSockets (chat, trading, gaming) ✔ MQTT (IoT lightweight publish/subscribe) ✔ Apache Kafka (real-time backbone) ✔ Spark Streaming ✔ Cassandra / HBase (big data stores) ✔ Microservices boom — each service emits events

Use Cases

💳 Real-time fraud detection
Banks detected anomalies instantly.

🛵 Ride-hailing apps (Uber, Ola)
Driver apps push location updates every few seconds.

Now the river is a fast-flowing network of channels.
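The publish/subscribe idea at the heart of this era can be sketched in a few lines of plain Python. This toy in-memory broker only shows the pattern; Kafka adds partitioned, durable, replayable logs on top of it, and the fraud rule below is purely illustrative:

```python
from collections import defaultdict

class MiniBroker:
    """Toy in-memory publish/subscribe broker (pattern only, no durability)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a handler to be called for every event on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        """Fan the event out to every subscriber of the topic."""
        for handler in self._subscribers[topic]:
            handler(event)

broker = MiniBroker()
alerts = []

# A fraud-detection service subscribes to payment events and flags large
# amounts (an illustrative threshold, not a real detection rule).
def fraud_check(event):
    if event["amount"] > 10_000:
        alerts.append(event)

broker.subscribe("payments", fraud_check)
broker.publish("payments", {"card": "****1111", "amount": 25})
broker.publish("payments", {"card": "****2222", "amount": 50_000})
```

The producer never knows who consumes the event — that decoupling is what let each microservice in the 2010s "emit events" without coordinating with every downstream system.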


⚡ 2020s – Ultra Real-Time Era (gRPC, Flink, Pulsar, Cloud Streams)

Data is now a live organism flowing through ecosystems.

How data moves

  • Billions of events per second
  • Ultra-fast binary communication
  • Event-driven microservices
  • AI models consume streams directly

API & communication evolution

  • gRPC (Google’s high-performance RPC framework): binary Protobuf encoding and low latency, a strong fit for microservice-to-microservice calls
  • GraphQL for optimized data queries
  • Server-Sent Events (SSE)
  • Event-driven systems (Kafka, Pulsar)

Streaming evolution

  • Event-time processing
  • Stateful stream processing
  • Exactly-once guarantees
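The first two ideas above can be illustrated with a minimal event-time tumbling-window count in plain Python. Flink and Kafka Streams implement this with state backends, watermarks, and checkpointing; this is only a sketch of the core logic, with invented sensor data:

```python
from collections import Counter

def tumbling_window_counts(events, window_ms):
    """Count events per key per event-time tumbling window.

    Each event is (event_time_ms, key). Windows are aligned to multiples of
    window_ms. Grouping by the event's own timestamp -- not by arrival
    order -- is what keeps results correct when events arrive late or out
    of order.
    """
    counts = Counter()  # the operator's state (stateful processing)
    for event_time, key in events:
        window_start = event_time - (event_time % window_ms)
        counts[(window_start, key)] += 1
    return counts

events = [
    (1000, "sensor-a"),
    (1500, "sensor-a"),
    (2100, "sensor-b"),
    (1700, "sensor-b"),  # arrives after a later event, still counted correctly
]
counts = tumbling_window_counts(events, window_ms=1000)
```

Exactly-once guarantees are the missing third piece: real engines checkpoint this state atomically with the input offsets, so a crash and replay never double-counts an event.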

Tech of the 2020s

✔ gRPC (microservices communication) ✔ GraphQL ✔ Apache Flink (industry standard for streaming) ✔ Kafka Streams / ksqlDB ✔ Apache Pulsar ✔ AWS Kinesis, Azure Event Hubs, GCP Pub/Sub ✔ Redis Streams ✔ Edge computing + MQTT 5

Use Cases

🚗 Connected cars (Tesla)
Vehicles stream telemetry continuously, millisecond by millisecond.

📹 Real-time video analytics
Traffic, retail stores, and surveillance systems.

📈 Algorithmic stock trading
Microsecond-level event processing.

Now the river becomes a massive, interconnected real-time ocean.


🤖 Future – AI-Native Streaming (LLM Event Streams, Digital Twins)

Data streaming merges with AI.

How data will move

  • LLMs processing events live
  • Vector DBs ingest streaming embeddings
  • IoT sending real-time digital-twin updates
  • AI agents reacting automatically

Future Tech

✔ Real-time RAG pipelines ✔ LLM Observability Streams ✔ Autonomous event routing ✔ Edge AI stream processing

Use Case

🏙️ AI-managed smart cities
AI dynamically adjusts traffic, pollution controls, and power distribution based on streaming data.

The river becomes self-regulating and intelligent.


⭐ Summary Table (Tech + Era + Use Cases)

| Era | API / Communication | Streaming Tech | Example Use Case |
| --- | --- | --- | --- |
| 1990s – Batch & RPC | FTP, RPC, CORBA, DCOM | Cron batch jobs, SQL batch loads | Nightly retail sales file transfer |
| 2000s – Web | SOAP, XML-RPC, WSDL | JMS/MQ, syslog, early pub/sub | Near real-time website traffic tracking |
| 2010s – Real-Time | REST, WebSockets, MQTT | Kafka, Storm, Spark Streaming | Fraud detection, ride-hailing location updates |
| 2020s – Ultra Real-Time | gRPC, GraphQL, SSE | Flink, Pulsar, cloud streams | Connected cars, video analytics, algo trading |
| Future – AI-Native | LLM event streams | Real-time RAG, edge AI streams | AI-managed smart cities |

