Elasticsearch: Understanding the basic architecture
Elasticsearch is a distributed, open-source search engine that is used for full-text search and analytics. It is designed to handle large amounts of data and provide fast and flexible search capabilities. The basic architecture of Elasticsearch consists of nodes, which are the basic building blocks of a cluster.
A node is a single running instance of Elasticsearch that stores data and takes part in the cluster's indexing and search operations. A cluster's nodes can all run on one machine or be spread across several, depending on the size and complexity of the data being indexed.
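Nodes take on one or more roles. The two that matter most for this overview are data nodes, which hold shards and carry out indexing and search operations on them, and master-eligible nodes, which can be elected master and are then responsible for cluster-wide actions such as creating or deleting indices and tracking which nodes belong to the cluster.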
Each node in Elasticsearch has a name and communicates with the other nodes in the cluster over the network. A node finds and joins a cluster through a discovery mechanism. The usual options are unicast discovery, where the node contacts a configured list of seed hosts, and cloud discovery, where a plugin looks up peers through the cloud provider's API. Multicast discovery existed in early releases but has since been removed.
A cluster in Elasticsearch is a group of one or more nodes working together to store and manage data. When multiple nodes are connected and working together in a cluster, Elasticsearch automatically distributes data and load balances queries across all the nodes in the cluster.
Sharding is the process of splitting an index into smaller parts called shards, which can be distributed across the nodes in a cluster. Each shard is a self-contained Lucene index that can be stored and managed independently of the other shards. Because the shards of one index can live on different nodes, Elasticsearch can handle large amounts of data and scale horizontally by adding nodes.
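To make the idea of splitting an index concrete, here is a minimal sketch of how a document could be mapped to a shard by hashing its ID and taking the remainder modulo the number of primary shards. The function names route_to_shard and group_by_shard are made up for this illustration, and zlib.crc32 is only a stand-in hash: Elasticsearch's own routing uses a different hash function and extra factors, so treat this as the general idea rather than the real implementation.

```python
import zlib


def route_to_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (here: the document ID) to a shard number."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Stand-in hash (assumption): crc32 is stable across runs, unlike Python's
    # built-in hash() for strings. It is NOT the hash Elasticsearch uses.
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % num_primary_shards


def group_by_shard(doc_ids, num_primary_shards):
    """Return {shard_number: [doc_id, ...]} for a batch of document IDs."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[route_to_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(10)]
    for shard, ids in group_by_shard(docs, 3).items():
        print(f"shard {shard}: {ids}")
```

One consequence of a scheme like this is that the same document ID always lands on the same shard as long as the shard count stays fixed, which is why the number of primary shards cannot simply be changed on an existing index.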
Shard allocation is the feature that decides which node holds each shard. Elasticsearch assigns shards to nodes automatically and rebalances them as nodes join or leave the cluster, which spreads the load. Keeping replica copies of shards on different nodes is what preserves data availability when a node fails.
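The following is a simplified sketch of shard distribution, assuming a plain round-robin placement and assuming each shard has one primary copy plus a number of replica copies that must sit on different nodes. The function allocate_shards and the node names are hypothetical. Elasticsearch's real allocator weighs many more factors (disk usage, allocation rules, existing balance), so this only illustrates the outcome, not the algorithm.

```python
from collections import defaultdict


def allocate_shards(nodes, num_primaries, num_replicas):
    """Spread shard copies over nodes, one copy of a given shard per node.

    Returns {node_name: [(shard_number, "primary" | "replica"), ...]}.
    """
    copies_per_shard = num_replicas + 1
    if copies_per_shard > len(nodes):
        # Simplification (assumption): this sketch refuses the request.
        # A real cluster would leave the extra replicas unassigned instead.
        raise ValueError("need at least num_replicas + 1 nodes")
    allocation = defaultdict(list)
    for shard in range(num_primaries):
        for copy in range(copies_per_shard):
            # Offsetting by the copy index guarantees distinct nodes per shard.
            node = nodes[(shard + copy) % len(nodes)]
            kind = "primary" if copy == 0 else "replica"
            allocation[node].append((shard, kind))
    return dict(allocation)


if __name__ == "__main__":
    layout = allocate_shards(["node-a", "node-b", "node-c"], 3, 1)
    for node, shards in layout.items():
        print(node, shards)
    # node-a [(0, 'primary'), (2, 'replica')]
    # node-b [(0, 'replica'), (1, 'primary')]
    # node-c [(1, 'replica'), (2, 'primary')]
```

In this layout every node holds one primary and one replica, and losing any single node still leaves a copy of each shard on the remaining two.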
In summary, an Elasticsearch cluster is made up of nodes (data nodes and master-eligible nodes among them) that work together to store and manage data. Indices are split into shards, and Elasticsearch automatically distributes those shards across the nodes for scalability and reliability.