An overview of databases

The choice of database usually depends on a few factors. The first is whether the data is highly structured or unstructured. The second is the query pattern, and the third is the scale the system has to support.

Let's first look at some common types of data stores. An upcoming article will take a deeper dive into comparisons.

  • Caching - Say you are querying a database and do not want to run the same query many times; you can cache the result. Similarly, if you are making remote calls to other services that may have high latency, you can cache their responses locally. Caching solutions are key-value stores. In these cases, the key would be the where clause of the query, or the query or request parameters of the API call, and the value would be the result. Commonly used solutions are Redis, Memcached, etcd, and Hazelcast.
  • Blob storage - Say you are designing an e-commerce platform with various products for sale. Sellers will upload product images and videos, and to store this kind of data you would use blob storage. Blob stores are not really databases: a database is fundamentally meant to be queried, and a file is not something you normally query. You just serve it as it is. One of the most common and fairly cost-effective options is Amazon S3.
  • Text search engine - Continuing with the e-commerce example, another common use case is text search across the products. Say you want to find a product by its exact title or by some part of its description; for this you would use a text search engine. Common implementations are Elasticsearch and Solr. An important point is that these are search engines, not databases, so you should not treat them as the primary source of truth. The primary data store should be somewhere else, and you load data from it into one of these systems to provide search.
  • Relational databases - This is the first kind of database that comes to mind when we hear the word "database". In our e-commerce system, you can use a relational database to store structured data. The key feature of these databases is the ACID (atomicity, consistency, isolation, and durability) guarantee. MySQL, Oracle, SQL Server, and PostgreSQL are common RDBMSs.
  • Time series database - Say you are building a system where different applications push metrics such as their throughput, CPU utilization, and latencies. This is where a time series database comes in. Think of it as an extension of the relational database that drops some functionality and adds some of its own. In a metrics-based system, you would not perform random updates or even random reads, so these databases are optimized for bulk reads over a given time range and for similar query and write patterns. InfluxDB and OpenTSDB are two popular time series databases.
  • Data warehouse - The next use case is when a company has a lot of information that it wants to analyze in one place. Continuing with the e-commerce example, say you want to provide analytics on all the transactions. Here you would need a data warehouse: a large data store into which you can dump all the data and then query it to serve reports, typically offline reporting. Something like Hadoop can fit this purpose.
  • Document databases - Say you are building a catalog for the e-commerce platform that holds information on every item. Each item has certain attributes, and not every item has the same ones, so it is tricky to store and query such data in a standard relational database. In such cases, document databases like MongoDB and Couchbase come to the rescue.
  • Columnar databases - These databases (often called wide-column stores) are optimized for non-structured, ever-increasing data on which you want to run a finite set of queries. In our e-commerce example, you could use a columnar database to store orders, since the orders data keeps growing and you would have a finite set of queries to run against it. Cassandra and HBase are good examples.
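To make the caching bullet above a little more concrete, here is a minimal sketch of the idea that the cache key is derived from the query or request parameters. It uses a plain in-memory dictionary as a stand-in for a real key-value store such as Redis or Memcached. The names `QueryCache`, `make_key`, and `cached_fetch` are hypothetical and chosen only for illustration.

```python
import json
import time


class QueryCache:
    """Tiny in-memory key-value cache where each entry expires after a TTL."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_seconds, value)


def make_key(name, params):
    # Sort the parameters so the same request always maps to the same key.
    return name + ":" + json.dumps(params, sort_keys=True)


def cached_fetch(cache, name, params, fetch):
    """Return the cached value for (name, params), calling fetch on a miss.

    Simplification: a fetch result of None is treated as a miss and is
    fetched again on the next call.
    """
    key = make_key(name, params)
    value = cache.get(key)
    if value is None:
        value = fetch(params)
        cache.set(key, value)
    return value
```

Here `fetch` would be whatever performs the expensive database query or remote call. A second call with the same parameters inside the TTL is served from the cache without invoking it again.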

So in a real-world scenario, you would typically use a combination of the above data stores to fulfill the functional and non-functional requirements.
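As a rough sketch of such a combination, the example below keeps products in a relational store as the source of truth and builds a separate search index from it. It assumes an in-memory SQLite database as the primary store and a very small inverted index as a stand-in for a search engine like Elasticsearch or Solr. The table layout and function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def build_primary_store(products):
    """Create the relational source of truth from (id, title) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO products (id, title) VALUES (?, ?)", products)
    conn.commit()
    return conn


def build_search_index(conn):
    """Load data from the primary store into a word -> product ids index."""
    index = defaultdict(set)
    for product_id, title in conn.execute("SELECT id, title FROM products"):
        for word in title.lower().split():
            index[word].add(product_id)
    return index


def search(conn, index, query):
    """Find products whose title contains every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # The index only yields ids; the rows come from the primary store.
    matching_ids = set.intersection(*(index.get(w, set()) for w in words))
    if not matching_ids:
        return []
    placeholders = ",".join("?" for _ in matching_ids)
    rows = conn.execute(
        f"SELECT id, title FROM products WHERE id IN ({placeholders}) ORDER BY id",
        sorted(matching_ids),
    )
    return rows.fetchall()
```

The index can be discarded and rebuilt from the relational data at any time, which is the sense in which the search engine is not the primary data store.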

More articles by Kaushal Pahwani

  • How is data stored in SQL databases?

    Today we are going to take a look at how a SQL database stores the data, this understanding can be really helpful in…

  • A helper to choose the right database?

    We have situations when it gets tricky to choose between a relational and non-relational database. Today we will take a…
