Elasticsearch: Understanding the basic architecture

Elasticsearch is a distributed, open-source search engine used for full-text search and analytics. It is designed to handle large amounts of data and provide fast, flexible search. Its architecture is built from nodes, which join together to form a cluster.

A node is a single instance of Elasticsearch that stores data and participates in the cluster's search and indexing capabilities. Nodes can be installed on a single machine or multiple machines, depending on the size and complexity of the data being indexed.

Nodes take on roles within the cluster, and a single node can hold more than one role. This article covers the two most fundamental ones: data nodes and master-eligible nodes.

  1. Data Nodes: These nodes store data and perform data-related operations such as indexing, searching, and aggregations. Data nodes hold the primary and replica shards of an index.
  2. Master-Eligible Nodes: These nodes are candidates to become the cluster's master. At any time one of them is the elected master, which performs cluster management tasks such as creating or deleting indices, assigning shards to nodes, and tracking which nodes are part of the cluster. If the master fails, the remaining master-eligible nodes elect a new one.
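
To see which roles the nodes of a cluster hold, one option is to read the `roles` list that the nodes info API (`GET _nodes`) reports for each node. The sketch below is a minimal illustration that works on an already parsed response; the `sample` dictionary is hypothetical and trimmed to the two fields used here, and the function name `group_nodes_by_role` is made up for this example.

```python
from collections import defaultdict


def group_nodes_by_role(nodes_info):
    """Map each role to the sorted names of the nodes that hold it."""
    by_role = defaultdict(list)
    for node in nodes_info["nodes"].values():
        for role in node.get("roles", []):
            by_role[role].append(node["name"])
    return {role: sorted(names) for role, names in by_role.items()}


# Hypothetical sample shaped like a parsed nodes info response,
# keyed by node ID and reduced to "name" and "roles".
sample = {
    "nodes": {
        "id-1": {"name": "node-a", "roles": ["master", "data"]},
        "id-2": {"name": "node-b", "roles": ["data"]},
        "id-3": {"name": "node-c", "roles": ["master"]},
    }
}

roles = group_nodes_by_role(sample)
print("master-eligible:", roles["master"])
print("data:", roles["data"])
```

In this sample, node-a holds both roles, which is common in small clusters; larger clusters often dedicate separate nodes to each role.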

Each node in Elasticsearch is assigned a unique name and can communicate with other nodes in the cluster over a network. Elasticsearch uses a discovery mechanism to find and join other nodes in the cluster. Older releases offered unicast and multicast discovery; multicast was later removed, and current versions locate peers through a list of seed hosts that is supplied in the configuration, in a file, or by a cloud provider plugin.

A cluster in Elasticsearch is a group of one or more nodes working together to store and manage data. When multiple nodes are connected and working together in a cluster, Elasticsearch automatically distributes data and load balances queries across all the nodes in the cluster.

Sharding is the process of breaking down a large index into smaller parts called shards, which can be distributed across multiple nodes in a cluster. Each shard is a self-contained index that can be stored and managed independently of other shards. By breaking an index into shards and distributing them across multiple nodes, Elasticsearch can handle large amounts of data and scale horizontally.
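
How a document ends up on a particular shard can be sketched with the simplified routing rule from the Elasticsearch documentation: hash the routing value (by default the document ID) and take the remainder modulo the number of primary shards. The example below is only an illustration of that idea. It uses `zlib.crc32` as a stand-in hash, which is an assumption made for this sketch and not the hash function Elasticsearch uses, and `pick_shard` is a hypothetical helper name.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Return the shard number for a routing value (by default the document ID)."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash for illustration only; Elasticsearch uses its own hash.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# Route 1000 made-up document IDs into an index with 3 primary shards
# and count how many land on each shard.
counts = [0] * 3
for i in range(1000):
    counts[pick_shard(f"doc-{i}", 3)] += 1
print(counts)
```

Because the result depends on the number of primary shards, changing that number would send existing documents to different shards. This is why the primary shard count is chosen when an index is created.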


Shard allocation is the process that decides which node holds each shard. Elasticsearch spreads the primary and replica shards of every index across the nodes, never places a replica on the same node as its primary, and rebalances shards when nodes join or leave. This keeps the data available if a node fails and lets the cluster scale by adding nodes.
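
The effect of spreading shard copies over nodes can be pictured with a toy allocator. The sketch below is not Elasticsearch's allocation algorithm; it is a simple round-robin placement, written under the assumption that the only rule that matters is keeping the copies of a shard on different nodes. The function name `allocate` and the node names are hypothetical.

```python
def allocate(num_primaries, num_replicas, nodes):
    """Place shard copies round-robin so no node holds two copies of a shard."""
    if num_replicas + 1 > len(nodes):
        # The toy version gives up here; a real cluster would instead
        # leave the extra replicas unassigned.
        raise ValueError("not enough nodes to separate all copies of a shard")
    placement = {node: [] for node in nodes}
    for shard in range(num_primaries):
        for copy in range(num_replicas + 1):
            # Shift each copy to the next node so copies never collide.
            node = nodes[(shard + copy) % len(nodes)]
            kind = "primary" if copy == 0 else "replica"
            placement[node].append((shard, kind))
    return placement


# 3 primary shards with 1 replica each, spread over 3 nodes.
layout = allocate(3, 1, ["node-a", "node-b", "node-c"])
for node, shards in layout.items():
    print(node, shards)
```

With this layout every node holds one primary and one replica, and losing any single node still leaves a copy of every shard on the remaining two.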

Figure: Relation between RDBMS and Elasticsearch terms

In summary, an Elasticsearch cluster is a set of nodes, some holding data and some eligible to act as master, that work together to store and manage data. Indices are split into shards, and Elasticsearch distributes those shards across the nodes for scalability and reliability.

More articles by Jeevan George John
