Serialization & Deserialization for Backend Engineers


>>> Why This Topic Actually Matters

If you strip backend engineering to its core, everything revolves around data movement:

  1. Client → Server
  2. Server → Database
  3. Microservice → Microservice
  4. Disk → Memory

Raw in-memory data structures (objects, classes, graphs) cannot travel over networks or be stored directly. They must first be converted into a transferable format.

That transformation is called:

  1. Serialization → Converting data into a storable/transmittable format
  2. Deserialization → Reconstructing the original data from that format

Without these, APIs, databases, caching systems, message queues, and distributed systems could not exchange or persist data.


>>> OSI Model: Where Serialization Fits

What is the OSI Model?

The OSI (Open Systems Interconnection) Model is a conceptual framework that explains how communication happens between systems.

7 Layers Overview

  7. Application
  6. Presentation
  5. Session
  4. Transport
  3. Network
  2. Data Link
  1. Physical

Critical Insight

Serialization maps conceptually to Layer 6 (the Presentation Layer)

That layer is responsible for:

  1. Data formatting
  2. Encoding
  3. Compression
  4. Encryption

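Translation: Before an application hands data to HTTP (Layer 7), it must serialize that data into a format like JSON or a binary encoding. Real network stacks do not ship Layer 6 as a separate component, so this step usually happens in application code or a library.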
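
To make the Layer 6 responsibilities concrete, here is a minimal sketch using only the Python standard library. It formats a dict as JSON, encodes the text as UTF-8 bytes, and compresses the bytes with gzip. The `payload` dict and the variable names are made up for illustration, and encryption is left out because it is normally handled by TLS rather than by application code.

```python
import gzip
import json

# Hypothetical payload, used only for illustration.
payload = {"id": 101, "name": "Mohit", "isActive": True}

# 1. Data formatting: in-memory dict -> JSON text
text = json.dumps(payload)

# 2. Encoding: text -> bytes (what actually goes on the wire)
raw_bytes = text.encode("utf-8")

# 3. Compression: optional, usually worthwhile only for larger payloads
compressed = gzip.compress(raw_bytes)

# The receiver reverses every step in the opposite order.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
print(restored == payload)
```

For a payload this small the gzip header can outweigh the savings, which is why compression is typically applied only above some size threshold.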


>>> What is Serialization?

Definition

Serialization is the process of converting an in-memory object into a format that can be stored or transmitted.

Simple Analogy

Think of serialization like:

Packing your clothes (object) into a suitcase (JSON/binary format) so they can travel.

Example: Serialization in Python

Original Object

user = {
    "id": 101,
    "name": "Mohit",
    "isActive": True
}
        

Serialized (JSON)

{
  "id": 101,
  "name": "Mohit",
  "isActive": true
}
        

Python Code

import json

user = {
    "id": 101,
    "name": "Mohit",
    "isActive": True
}

serialized = json.dumps(user)
print(serialized)
        

>>> Serialization Standards

Serialization is not random. There are standards (formats) that define how data is structured.

These fall into two major categories:

1) Text-Based Serialization

Characteristics

  1. Human-readable
  2. Easy debugging
  3. Larger size
  4. Slower processing

Common Formats

  1. JSON (Most Important)
  2. XML
  3. YAML

Example: JSON

{
  "name": "Mohit",
  "skills": ["Python", "Backend"],
  "experience": 2
}
        

Example: XML

<user>
  <name>Mohit</name>
  <skills>Python</skills>
</user>
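
As a rough sketch of how such a document could be produced and read back in Python, the standard library's `xml.etree.ElementTree` module is enough. The element names simply mirror the example above.

```python
import xml.etree.ElementTree as ET

# Serialize: build the element tree, then render it as a string.
user = ET.Element("user")
ET.SubElement(user, "name").text = "Mohit"
ET.SubElement(user, "skills").text = "Python"
xml_text = ET.tostring(user, encoding="unicode")
print(xml_text)

# Deserialize: parse the string back into elements.
parsed = ET.fromstring(xml_text)
print(parsed.find("name").text)
```

Note that every value comes back as text. Unlike JSON, XML carries no built-in distinction between numbers, booleans, and strings.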
        

When to Use Text-Based Formats

  1. REST APIs
  2. Debugging-heavy systems
  3. Public APIs
  4. Web applications

2) Binary Serialization

Characteristics

  1. Not human-readable
  2. Compact size
  3. Faster
  4. More efficient

Common Formats

  1. Protocol Buffers (Protobuf)
  2. Avro
  3. MessagePack
  4. Thrift

Example: Protobuf Concept

Instead of sending:

{ "id": 101, "name": "Mohit" }
        

It sends a compact binary encoding (tightly packed, not compressed). Assuming a schema where id is field 1 and name is field 2, the bytes are:

08 65 12 05 4D 6F 68 69 74
        

Smaller payload → faster network transfer → lower cost
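
The size difference can be checked by hand. The sketch below is not the Protobuf library. It is a hand-rolled encoder for one assumed schema (field 1 = `id` as a varint, field 2 = `name` as a string), written only to show where the bytes come from. Real projects generate this code from a `.proto` file instead.

```python
import json

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_user(user_id: int, name: str) -> bytes:
    """Hand-encode the assumed message: field 1 = id, field 2 = name."""
    name_bytes = name.encode("utf-8")
    return (
        encode_varint((1 << 3) | 0)      # tag: field 1, wire type 0 (varint)
        + encode_varint(user_id)
        + encode_varint((2 << 3) | 2)    # tag: field 2, wire type 2 (length-delimited)
        + encode_varint(len(name_bytes))
        + name_bytes
    )

binary = encode_user(101, "Mohit")
as_json = json.dumps({"id": 101, "name": "Mohit"}).encode("utf-8")

print(binary.hex(" "))            # 08 65 12 05 4d 6f 68 69 74
print(len(binary), len(as_json))  # 9 28
```

The binary form is smaller mainly because field names are replaced by small numeric tags, which is also why both sides need the schema to make sense of the bytes.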

When to Use Binary Formats

  1. Microservices communication
  2. High-performance systems
  3. Streaming pipelines
  4. Large-scale distributed systems


>>> What is Deserialization?

Definition

Deserialization is the process of converting serialized data back into its original object form.

Example in Python

import json

data = '{"id": 101, "name": "Mohit"}'

obj = json.loads(data)
print(obj["name"])
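
In a real service the input is untrusted, so deserialization usually includes error handling and a conversion into a typed object. The following is a minimal sketch of that idea. The `User` dataclass and the `parse_user` helper are hypothetical names introduced here, not part of the example above.

```python
import json
from dataclasses import dataclass

@dataclass
class User:  # hypothetical model for illustration
    id: int
    name: str

def parse_user(raw: str) -> User:
    """Turn a JSON string into a User, rejecting malformed or incomplete input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        return User(id=int(data["id"]), name=str(data["name"]))
    except KeyError as exc:
        raise ValueError(f"missing field: {exc.args[0]}") from exc

user = parse_user('{"id": 101, "name": "Mohit"}')
print(user.name)  # Mohit
```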
        

Real Flow

Client sends JSON → Server deserializes → Processes → Serializes → Sends back
        

>>> Why Serialization & Deserialization Are Used

1. Network Communication

APIs cannot send Python objects or Java objects directly.

They must send:

  1. JSON
  2. XML
  3. Binary

2. Data Storage

Databases often store serialized data:

  1. JSON/JSONB columns (PostgreSQL)
  2. BSON documents (MongoDB)
  3. Binary blobs
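
A small sketch of the storage case, using SQLite as a stand-in because it ships with Python. The table and column names are made up, and a database with a native JSON type would handle part of this work itself.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, profile TEXT)")

profile = {"name": "Mohit", "skills": ["Python", "Backend"]}

# Serialize before writing: the column only ever sees a string.
conn.execute(
    "INSERT INTO users (id, profile) VALUES (?, ?)",
    (101, json.dumps(profile)),
)

# Deserialize after reading to get a dict back.
(stored,) = conn.execute(
    "SELECT profile FROM users WHERE id = ?", (101,)
).fetchone()
loaded = json.loads(stored)
print(loaded["skills"])
conn.close()
```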

3. Caching

Caches like Redis store strings and bytes, so the application serializes an object before writing it and deserializes it after reading.

4. Message Queues

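Kafka and RabbitMQ treat message payloads as raw bytes. Producers serialize and consumers deserialize, which underpins:

  1. Event streaming
  2. Communication between distributed services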
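
The producer/consumer contract can be sketched without a real broker. Here `queue.Queue` from the standard library stands in for a topic or queue, and the `publish` and `consume` helpers are hypothetical names. The point is only that bytes go in and bytes come out.

```python
import json
import queue

broker = queue.Queue()  # stand-in for a Kafka topic or RabbitMQ queue

def publish(event: dict) -> None:
    """Producer side: serialize the event to bytes before handing it to the broker."""
    broker.put(json.dumps(event).encode("utf-8"))

def consume() -> dict:
    """Consumer side: take raw bytes off the queue and deserialize them."""
    return json.loads(broker.get().decode("utf-8"))

publish({"type": "user_created", "user_id": 101})
event = consume()
print(event["type"])
```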

5. Language Interoperability

A Python service can talk to a Java service using JSON or Protobuf.


>>> JSON in Detail (Core Backend Weapon)

What is JSON?

JSON (JavaScript Object Notation) is a lightweight text-based data format used for data exchange.

Key Characteristics

  1. Human-readable
  2. Language-independent
  3. Key-value structure
  4. Widely supported

JSON Data Types

{
  "string_example": {
    "name": "Mohit"
  },
  "number_example": {
    "age": 21
  },
  "boolean_example": {
    "active": true
  },
  "null_example": {
    "data": null
  },
  "object_example": {
    "user": {
      "key": "value"
    }
  },
  "array_example": {
    "numbers": [1, 2, 3]
  }
}        
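
For reference, this is how Python's `json` module maps those types when parsing. The sample document below is a flattened variant of the one above, with a `score` field added here to show that JSON numbers can become either `int` or `float`.

```python
import json

doc = """
{
  "name": "Mohit",
  "age": 21,
  "score": 9.5,
  "active": true,
  "data": null,
  "user": {"key": "value"},
  "numbers": [1, 2, 3]
}
"""
parsed = json.loads(doc)

# Print the Python type each JSON value was converted to.
for key, value in parsed.items():
    print(f"{key}: {type(value).__name__}")
```

Strings become `str`, numbers become `int` or `float`, `true`/`false` become `bool`, `null` becomes `None`, objects become `dict`, and arrays become `list`.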

Complex JSON Example

{
  "user": {
    "id": 101,
    "name": "Mohit",
    "skills": ["Python", "Node.js"],
    "education": {
      "degree": "B.Tech",
      "year": 2027
    }
  }
}
        

JSON Serialization in Different Languages

Python

json.dumps(data)
json.loads(json_string)
        

JavaScript

JSON.stringify(obj)
JSON.parse(jsonString)
        

Java (Jackson)

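// writeValueAsString and readValue can throw JsonProcessingException
ObjectMapper mapper = new ObjectMapper();
String json = mapper.writeValueAsString(obj);
User user = mapper.readValue(json, User.class);  // User is a hypothetical POJO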
        

>>> JSON vs Binary: Brutal Comparison

  1. Readability: JSON → High (human-readable), Binary → None (not human-readable)
  2. Size: JSON → Larger payload, Binary → Smaller and compact
  3. Speed: JSON → Moderate processing speed, Binary → High performance
  4. Debugging: JSON → Easy to inspect, Binary → Difficult to debug
  5. Use Case: JSON → APIs and web apps, Binary → Microservices and high-performance systems
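
The size row is easy to measure. The sketch below uses Python's `struct` module as a simple stand-in for a binary format (it is not Protobuf or MessagePack), and the sensor readings are invented data.

```python
import json
import struct

# Hypothetical sensor readings: (sensor_id, temperature) pairs.
readings = [(i, 20.0 + i * 0.5) for i in range(1000)]

# Text: a JSON array of small objects, encoded as UTF-8.
as_json = json.dumps(
    [{"sensor_id": sid, "temperature": temp} for sid, temp in readings]
).encode("utf-8")

# Binary: each reading packed as a 4-byte unsigned int plus an 8-byte double.
record = struct.Struct("<Id")
as_binary = b"".join(record.pack(sid, temp) for sid, temp in readings)

print(len(as_json), len(as_binary))

# The binary form round-trips exactly, but only if the reader knows the layout.
decoded = [
    record.unpack_from(as_binary, i * record.size) for i in range(len(readings))
]
print(decoded == readings)
```

The binary payload is a fixed 12 bytes per reading, while the JSON payload repeats the field names in every object. The trade-off is the debugging row: the binary bytes mean nothing without the layout.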


>>> Common Mistakes Beginners Make

Mistake 1: Assuming JSON supports everything

JSON has NO native representation for:

  1. Functions
  2. Dates (convert to a string, typically ISO 8601)
  3. Custom objects (convert to plain objects and arrays first)
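
In Python, one common workaround is the `default` hook of `json.dumps`, which is called for any value the encoder cannot handle on its own. This is a minimal sketch; the `User` dataclass and the `to_jsonable` function are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime, timezone

@dataclass
class User:  # hypothetical custom object
    id: int
    name: str
    created_at: datetime

def to_jsonable(obj):
    """Fallback for json.dumps: called only for values it cannot serialize itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

user = User(101, "Mohit", datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc))
encoded = json.dumps(user, default=to_jsonable)
print(encoded)
```

The conversion is one-way: after `json.loads`, `created_at` is a plain string, so the reader has to call something like `datetime.fromisoformat` to get a date back.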

Mistake 2: Ignoring Schema Validation

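Always validate incoming JSON before using it:

  1. Use JSON Schema (or an equivalent validation library) to check required fields and types
  2. Reject invalid payloads at the boundary instead of letting them cause runtime crashes deeper in the code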
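
To show the idea without extra dependencies, here is a hand-rolled check using only the standard library. It is a simplified stand-in, not JSON Schema itself. In a real project a schema validator library would replace the loop, and the `REQUIRED_FIELDS` rules here are invented for the example.

```python
import json

# Hypothetical rules: field name -> required Python type after parsing.
REQUIRED_FIELDS = {"id": int, "name": str, "isActive": bool}

def validate_user(raw: str) -> dict:
    """Parse raw JSON and check required fields and types before business logic runs."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject True/False where int is expected
        if not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        ):
            raise ValueError(f"{field} must be {expected.__name__}")
    return data

print(validate_user('{"id": 101, "name": "Mohit", "isActive": true}'))

try:
    validate_user('{"id": "101", "name": "Mohit", "isActive": true}')
except ValueError as exc:
    print(exc)  # id must be int
```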

Mistake 3: Overusing JSON in High-Performance Systems

If you are building:

  1. Real-time systems
  2. High-throughput pipelines

Consider Protobuf or MessagePack once profiling shows that serialization is a bottleneck


>>> Advanced Insight: Serialization Trade-offs

You must choose based on:

  • Latency requirements
  • Bandwidth cost
  • Debuggability
  • Scalability

Strategic Thinking

  • Startup MVP → JSON
  • Scaling system → Hybrid (JSON + Binary)
  • Enterprise infra → Binary-first


>>> Real Backend Flow Example

API Request Lifecycle

  1. Client sends JSON
  2. Server deserializes → Object
  3. Business logic executes
  4. Data fetched from DB
  5. Response serialized → JSON
  6. Sent back to client
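
The lifecycle above can be sketched as a single framework-free function. Everything named here (`FAKE_DB`, `handle_request`, the `user_id` field, the status codes chosen) is an assumption made for illustration. A real service would sit behind a web framework and query an actual database.

```python
import json
from typing import Tuple

# Stand-in for a database table; a real service would query its data store here.
FAKE_DB = {101: {"id": 101, "name": "Mohit", "isActive": True}}

def handle_request(body: bytes) -> Tuple[int, bytes]:
    """Walk the lifecycle for a hypothetical 'get user' request; returns (status, body)."""
    # Step 2: deserialize the request body into a Python object
    try:
        request = json.loads(body)
        user_id = int(request["user_id"])
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "invalid request"}).encode("utf-8")

    # Steps 3-4: business logic and data fetch
    user = FAKE_DB.get(user_id)
    if user is None:
        return 404, json.dumps({"error": "user not found"}).encode("utf-8")

    # Step 5: serialize the response
    return 200, json.dumps({"user": user}).encode("utf-8")

# Steps 1 and 6: the client sends JSON bytes and receives JSON bytes back.
status, response = handle_request(b'{"user_id": 101}')
print(status, response.decode("utf-8"))
```

Serialization and deserialization sit only at the edges of the function. Everything in between works with ordinary Python objects.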


>>> Final Takeaways

  • Serialization is non-negotiable in backend systems
  • JSON is your entry weapon, not your final weapon
  • Binary formats are a common choice for internal traffic at scale
  • OSI Layer 6 is where this transformation conceptually lives
  • Mastering this unlocks APIs, microservices, distributed systems

You now have enough depth to not just use serialization, but to design systems around it.

