Implementing DB - Stable, reliable, resilient

Our goal is always to build stable, reliable, resilient, secure, redundant, and ready-to-scale network solutions (it is part of our strategy). We therefore want to share a team accomplishment of which we are very proud: implementing a MariaDB Galera Cluster solution that brings many benefits to our projects, under the close supervision of Marinko Mrksic and Bogdan Marinas, who were responsible for the technical implementation and setup of the DBs.

One important requirement was keeping the data safe and secure and, in case of disturbances in the network, avoiding at all costs losing any of it.

The challenge started with the architecture, for which the team studied different solutions and sources. To address our requirements, we chose an open-source solution that offers most of the benefits we were looking for.

The candidate for this task was a Galera Cluster solution, which we considered a valid option for our project needs and the best fit for the requirements. It had to be robust, easy to configure, and present in both Prod and Geo (in different data centers).

What is Galera Cluster? It is a synchronous multi-master replication plug-in for InnoDB. It is very different from regular MySQL replication and addresses several issues, including write conflicts when writing on multiple masters, replication lag, and slaves being out of sync with the master. Users do not have to know which server they can write to (the master) and which servers they can read from (the slaves).



Fig.1 Replication differences between MySQL and Galera

An application can write to any node in a Galera cluster, and transaction commits (row-based replication events) are then applied on all servers via certification-based replication. Certification-based replication is an alternative approach to synchronous database replication that uses group communication and transaction ordering techniques.
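To make the idea of certification more concrete, here is a minimal Python sketch of a certification test. It is a simplified model and not Galera's actual implementation: the `WriteSet` and `Certifier` names, the row-key strings, and the `last_seen_seqno` field are assumptions made for illustration. The point it tries to show is that every node can apply the same deterministic check to write sets arriving in the same global order, so all nodes reach the same commit or rollback decision.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteSet:
    """A transaction's replicated changes (hypothetical, simplified model)."""
    origin: str            # node that executed the transaction
    keys: frozenset        # row keys the transaction modified
    last_seen_seqno: int   # last global seqno visible when the transaction ran


@dataclass
class Certifier:
    """Deterministic certification test; every node would run the same logic."""
    seqno: int = 0
    last_writer: dict = field(default_factory=dict)  # row key -> seqno of last commit

    def certify(self, ws: WriteSet) -> bool:
        # Each write set first receives its position in the global order.
        self.seqno += 1
        # Conflict: a row was committed by another transaction after this
        # transaction took its snapshot of the data.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.last_seen_seqno:
                return False  # certification fails, the transaction rolls back
        # No conflict: record this write set as the latest writer of its rows.
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


certifier = Certifier()
a = WriteSet("node1", frozenset({"accounts:42"}), last_seen_seqno=0)
b = WriteSet("node2", frozenset({"accounts:42"}), last_seen_seqno=0)  # same row as a
c = WriteSet("node3", frozenset({"accounts:7"}), last_seen_seqno=0)   # different row
results = [certifier.certify(ws) for ws in (a, b, c)]
print(results)
```

In this model, the two transactions that touched the same row concurrently cannot both commit: the one ordered first wins and the other is rolled back, while the transaction on a different row is unaffected.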

The minimal Galera cluster consists of 3 nodes, and the recommendation is to run an odd number of nodes. The reason is that, should there be a problem applying a transaction on one node (e.g., a network problem or the machine becoming unresponsive), the two other nodes will still have a quorum (i.e., a majority) and will be able to proceed with the transaction commit. In addition, with the Geo-Red solution set up, we have 6 nodes in total, split across different availability zones (3 per DC). The arbitrator sits outside both Prod and Geo, so if one node is down or one site is impacted, the switch happens automatically, without any manual intervention.
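The quorum reasoning can be checked with a few lines of arithmetic. The Python sketch below is a simplification: it assumes every member has one vote and ignores details such as node weights, and the `has_quorum` helper is a hypothetical name. The layout (3 nodes in Prod, 3 in Geo, 1 external arbitrator) is taken from the description in this article.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes * 2 > total_votes


# Layout described in the article: 3 nodes per DC plus one external arbitrator.
PROD_NODES, GEO_NODES, ARBITRATORS = 3, 3, 1
TOTAL = PROD_NODES + GEO_NODES + ARBITRATORS

# The link between the DCs breaks, but Prod can still reach the arbitrator.
prod_side = has_quorum(PROD_NODES + ARBITRATORS, TOTAL)
geo_side = has_quorum(GEO_NODES, TOTAL)

# Same split without an arbitrator: 3 votes on each side of 6 in total.
no_arbitrator_side = has_quorum(3, 6)

# Minimal 3-node cluster with one node down.
three_node_one_down = has_quorum(2, 3)

print(prod_side, geo_side, no_arbitrator_side, three_node_one_down)
```

With an even split and no arbitrator, neither side holds a majority, so neither can safely continue. The extra vote is what lets exactly one side keep operating.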

This plug-in is an open-source solution. There are 3 Galera solutions:

  • MySQL Galera Cluster by Codership
  • Percona XtraDB Cluster by Percona
  • MariaDB Galera Cluster (5.5 and 10.0) by MariaDB (the option we used)

 

The Galera Cluster replication implementation is built from these 4 components:

  • Database Management System - The database server that runs on the individual node. Of the supported DBMSs, we used MariaDB Server.
  • wsrep API - The interface and the responsibilities for the database server and replication provider. It provides integration with the database server engine for write-set replication.
  • Galera Plugin - The plugin that enables the write-set replication service functionality.
  • Group Communication plugins - The various group communication systems available to Galera Cluster.

A database built on Galera Cluster technology needs to incorporate the Write-Set Replication (wsrep) API patch into its database server codebase. This allows the Galera plugin, which works as a wsrep provider, to communicate and replicate transactions (write sets in Galera terms) via a group communication protocol. This enables a synchronous master-master setup for InnoDB, in which transactions are synchronously committed on all nodes.
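In practice, the wsrep provider is wired into the database server through configuration. The Python sketch below renders a minimal `[mysqld]` fragment with commonly documented MariaDB/Galera options. It is an illustration, not our production configuration: the `render_galera_config` helper, the cluster and node names, the IP addresses, and the provider library path are all assumptions, and option availability should be checked against the MariaDB version in use.

```python
import configparser
import io


def render_galera_config(cluster_name, node_name, node_address, member_addresses,
                         provider_path="/usr/lib/galera/libgalera_smm.so"):
    """Render a minimal [mysqld] fragment for one Galera node (illustrative only)."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names exactly as written
    config["mysqld"] = {
        # Galera replicates row events and works with InnoDB tables.
        "binlog_format": "ROW",
        "default_storage_engine": "InnoDB",
        "innodb_autoinc_lock_mode": "2",
        # Load the Galera plugin as the wsrep provider.
        "wsrep_on": "ON",
        "wsrep_provider": provider_path,  # assumed path, differs per distribution
        # Group communication: the cluster members this node tries to contact.
        "wsrep_cluster_name": cluster_name,
        "wsrep_cluster_address": "gcomm://" + ",".join(member_addresses),
        "wsrep_node_name": node_name,
        "wsrep_node_address": node_address,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical names and addresses, for illustration only.
text = render_galera_config(
    cluster_name="example_cluster",
    node_name="prod-db-1",
    node_address="10.0.0.1",
    member_addresses=["10.0.0.1", "10.0.0.2", "10.0.0.3"],
)
print(text)
```

Generating the fragment from one function keeps the per-node differences (name and address) explicit, while the cluster-wide settings stay identical on every node.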

If a node fails, the other nodes continue to operate and keep the data up to date. When the failed node comes up again, it automatically synchronizes with the other nodes through State Snapshot Transfer (SST) or Incremental State Transfer (IST), depending on its last known state, before it is allowed back into the cluster. No data is lost when a node fails.
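The choice between SST and IST can be pictured as a small decision function. The Python sketch below is a simplified model under stated assumptions, not Galera's real logic: the function name, its parameters, and the idea of comparing the rejoining node's last sequence number with the range of write sets still cached on a donor are illustrative.

```python
def choose_state_transfer(joiner_uuid, joiner_seqno,
                          donor_uuid, donor_cache_first_seqno, donor_last_seqno):
    """Pick how a rejoining node catches up (simplified, hypothetical model)."""
    # Unknown history or an unclean state: only a full snapshot is safe.
    if joiner_uuid != donor_uuid or joiner_seqno < 0:
        return "SST"
    # Nothing was missed while the node was away.
    if joiner_seqno >= donor_last_seqno:
        return "NONE"
    # IST needs every write set after joiner_seqno to still be in the donor's cache.
    if joiner_seqno + 1 >= donor_cache_first_seqno:
        return "IST"
    # The gap is larger than what the donor still holds.
    return "SST"


# Short outage: the missing write sets 951..1000 are still cached on the donor.
short_outage = choose_state_transfer("uuid-a", 950, "uuid-a", 900, 1000)
# Long outage: write sets 101..899 are no longer available incrementally.
long_outage = choose_state_transfer("uuid-a", 100, "uuid-a", 900, 1000)
# Node state is marked as unknown (negative seqno) after an unclean shutdown.
unclean = choose_state_transfer("uuid-a", -1, "uuid-a", 900, 1000)
print(short_outage, long_outage, unclean)
```

The practical consequence is that short interruptions are cheap to recover from, while a long outage or an unclean shutdown can trigger a full copy of the data set.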

Galera Cluster makes use of certification-based replication (or, more precisely, certification-based conflict resolution), a form of synchronous replication with reduced overhead. It is based on academic research, in particular on Fernando Pedone's Ph.D. thesis.

Initially, a different configuration was implemented at the DB level. We had 2 clusters in each DC and, after tests and several issues, including loss of data (any small disturbance affected the DBs, and we had to solve these issues), the team decided to change the approach.

After further study and investigation, we moved to 3 nodes in each environment, Prod and Geo, instead of 2, and added an arbitrator. A failure of the arbitrator does not affect cluster operations, as a new instance can be reattached to the cluster at any time, and a cluster can have several arbitrators.

We now have 1 cluster that spans both DCs. All DBs are connected to one common cluster, and all nodes talk to each other.

As mentioned earlier, with the MariaDB Galera Cluster solution the data is spread and kept in sync on all DB nodes and shared across the 2 DCs, under the guidance of an external arbitrator situated in a 3rd DC. The arbitrator is used for failover control, redirecting the data traffic in case disturbances are experienced in the network.


Fig.2 Our current configuration

Galera Cluster benefits

Galera Cluster has several benefits:

  • A high availability solution with synchronous replication, failover, and resynchronization
  • No loss of data
  • All servers have up-to-date data (no slave lag)
  • Read scalability
  • 'Pretty good' write scalability
  • High availability across data centers 

There are also some limitations:

  • It supports only the InnoDB and XtraDB storage engines
  • With an increasing number of writeable masters, the transaction rollback rate may increase, especially if there is write contention on the same dataset (a.k.a. a hotspot). This increases transaction latency.
  • It is possible for a slow/overloaded master node to affect performance of the Galera cluster. Therefore, it is recommended to have uniform servers across the cluster.


Tests have been successfully performed in both Prod and Geo-Red.

Monitoring is also integrated via Prometheus, which collects metrics through the NetData interface present on each node. Errors from the nodes are visualized in Grafana. Grafana is integrated with Slack and sends notifications to a shared Slack channel, where the operations team acts according to the agreed processes and parameters.
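As an illustration of the kind of signal such monitoring can evaluate, the Python sketch below checks a few standard Galera status variables (`wsrep_cluster_status`, `wsrep_ready`, `wsrep_local_state_comment`, `wsrep_cluster_size`), as returned by `SHOW GLOBAL STATUS LIKE 'wsrep_%'`. It is not our actual alerting rule set: the `check_node_health` helper, the sample values, and the expected cluster size of 7 (6 nodes plus the arbitrator) are assumptions.

```python
def check_node_health(status, expected_cluster_size):
    """Return a list of problems found in one node's wsrep status (illustrative)."""
    problems = []
    # The node must belong to the primary component, i.e. the side with quorum.
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not part of the primary component")
    # The node must be ready to accept queries.
    if status.get("wsrep_ready") != "ON":
        problems.append("node is not ready for queries")
    # The node must be fully synchronized with the cluster.
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced")
    # Every expected member should be visible.
    if int(status.get("wsrep_cluster_size", 0)) < expected_cluster_size:
        problems.append("cluster has fewer members than expected")
    return problems


# Hypothetical sample values for a healthy node and for a node that lost quorum.
healthy = {
    "wsrep_cluster_status": "Primary",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "7",
}
isolated = {
    "wsrep_cluster_status": "non-Primary",
    "wsrep_ready": "OFF",
    "wsrep_local_state_comment": "Initialized",
    "wsrep_cluster_size": "3",
}
healthy_problems = check_node_health(healthy, expected_cluster_size=7)
isolated_problems = check_node_health(isolated, expected_cluster_size=7)
print(healthy_problems, isolated_problems)
```

A check like this returns an empty list for a healthy node, and each entry in a non-empty list maps naturally to one alert message.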

The Galera cluster is a robust setup that offers great stability, reliability, and fast recovery, all of it automatic, and these properties are frequently tested on our Pan-Net cloud infrastructure. All this brings our team closer together, joining forces through collaboration to accomplish great results.

Our team is part of Deutsche Telekom Pan-Net. We work together in the Data & ICT unit, which is part of Service Delivery, led by Angelo Giannattasio. We are a truly European team with different cultures that is result oriented and always looking for the best solution. We document, implement, test, and deliver. Starting from design, we build the environments on our own Pan-Net cloud platform, going from infrastructure to application implementation, and work to overcome all the challenges along the way.


Thanks for sharing! I'm wondering how far your DCs are in your Geo-Red setup since the last time we worked with galera, it was super sensitive to high TTLs

Thanks Angelo for sharing the detailed writeup...its really sound good and if I will get chance to implement somewhere, would definitely implement architecture this kind of geo-red setup for DB.

Hi Angelo, Thanks for sharing this interesting insight!
