How to Choose a Database?

In software development, engineering managers often have to choose among the technologies already available on the market, and one of the most challenging of these decisions is picking the appropriate database. Programming languages are usually similar to each other, so the choice of language is largely a matter of taste. Choosing the wrong database, however, can be disastrous for a project. This article discusses which aspects to pay attention to when selecting a DBMS.

Picking a database is a relatively long-term commitment for a business’s technical decision maker. In an application that is part of a distributed system, all changes are captured in some sort of database. If you realize down the line that you’ve made the wrong choice, migrating to another database is a very costly and risky procedure, and doing it with zero downtime is even harder.

The Main Technologies

Remember that you don’t have to limit yourself to one technology. Ideally, each service should have its own database, so you can choose the right technology for each one. Try to find a compromise between forcing one ill-suited technology on every subsystem and creating a zoo of technologies. There’s no universal recipe, but it’s best to make a list of approved technologies for each of the following tasks:

  • relational database for OLTP (MySQL, Postgres);
  • relational database for OLAP (ClickHouse);
  • cache (Redis, Memcached, Tarantool);
  • key-value storages (Redis, Memcached, RocksDB, Riak);
  • cluster solutions (Cassandra, MongoDB).

The list above is somewhat tentative. For example, Tarantool is often used not as a cache but as a primary data store, and MySQL can be deployed as a clustered solution. But this separation will at least give you a feel for the technology.
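
As a rough illustration, such a list of approved technologies could be kept as a small machine-readable registry that teams check before adopting a database. The sketch below is only one possible shape: the category keys, the `APPROVED` name, and the `is_approved` helper are assumptions made for this example, not part of any tool.

```python
# Hypothetical registry: task category -> technologies approved for it.
# The category keys and the grouping mirror the tentative list above.
APPROVED = {
    "oltp": ["MySQL", "Postgres"],
    "olap": ["ClickHouse"],
    "cache": ["Redis", "Memcached", "Tarantool"],
    "key-value": ["Redis", "Memcached", "RocksDB", "Riak"],
    "cluster": ["Cassandra", "MongoDB"],
}


def is_approved(task: str, technology: str) -> bool:
    """Return True if `technology` is on the approved list for `task`."""
    approved = (name.lower() for name in APPROVED.get(task, []))
    return technology.lower() in approved
```

A service owner could then call `is_approved("oltp", "Postgres")` during a design review and start a discussion only when the answer is negative.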

[Figure: common data categories and use cases]

Functional Features

Another critical factor in the choice is the set of functional requirements for the DBMS. If you are developing a full-text search system, you are better off not using Redis, MongoDB, etc., and choosing specialized solutions such as Sphinx or Elasticsearch instead. If you need a service that works with geodata, make sure that the database supports geospatial indexes.

Keep in mind whether the data store can be scaled horizontally, and how easily. MongoDB, for example, is generally much easier to scale out than MySQL. But scaling is not always needed, so don’t overcomplicate things. If your project has a small target audience and won’t grow, take the simplest, most reliable solution. This will save you time at the start.

Fault tolerance is also a crucial factor. All servers fail, and the more servers you have in your system, the more likely it is that one of them will fail. Avoid creating single points of failure, at least in critical subsystems. Investigate the fault tolerance capabilities of your systems. Some solutions, such as Cassandra, are resilient to losing an entire data center. Keep in mind, though, that fault tolerance does not come for free. First, run drills to make sure you’ve configured everything correctly. Second, the price of fault tolerance is usually some loss of speed.
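
The claim that more servers mean a higher chance of some failure can be made concrete with a little arithmetic. The sketch below assumes that servers fail independently and with the same probability over some period, which is a simplification; the function name and the 1% figure are illustrative assumptions, not measurements.

```python
def p_any_failure(p_single: float, servers: int) -> float:
    """Probability that at least one of `servers` machines fails.

    Assumes independent failures with the same per-server
    probability `p_single` over the period of interest.
    """
    return 1.0 - (1.0 - p_single) ** servers


# Illustrative numbers only: a 1% chance of failure per server.
for n in (1, 10, 100):
    print(f"{n:>3} servers: {p_any_failure(0.01, n):.3f}")
```

Under these assumptions the chance of seeing at least one failure is roughly 10% with ten servers and above 60% with a hundred, which is why single points of failure become more dangerous as the system grows.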

Be sure to consider the reliability of the database. For example, MongoDB didn’t support multi-document ACID transactions until version 4.0, released in mid-2018. Evaluate whether reliability is critical for the service in question or whether performance matters more. For logging systems, speed is likely to be preferable. But using systems without ACID transactions for financial subsystems is not a good idea.
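
To see why transactions matter for financial data, consider a money transfer: the debit and the credit must either both happen or both be undone. The sketch below uses SQLite from the Python standard library purely as a convenient stand-in for a transactional database; the `accounts` table and the `transfer` function are hypothetical and exist only for this example.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    """Move `amount` from `src` to `dst` atomically."""
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Raising here undoes both updates above.
            raise ValueError("insufficient funds")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds
try:
    transfer(conn, 1, 2, 500)  # fails and is rolled back as a whole
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed second transfer neither account has changed, so no money is created or lost. Without transactional guarantees, a crash or error between the two updates could leave the accounts in an inconsistent state.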

Other important criteria for choosing the right database technology for your service:

[Figure: flowchart for choosing a database]
  • Query Patterns: How complex are your query patterns? Do you just need retrieval by key, or also by various other parameters? Do you also need fuzzy search on the data?
  • Consistency: Is strong consistency required (read-after-write, especially when you switch writes to a different data center), or is eventual consistency acceptable?
  • Storage Capacity: How much storage capacity is needed?
  • Performance: What is the needed throughput and latency?
  • Maturity and Stability: If you choose a self-hosted deployment, how much experience does your DBA team have with this technology, and how mature is it?
  • Cost: If you choose a managed cloud solution, what are the costs, and what are its limitations?
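
One simple way to compare candidates against these criteria is a weighted scoring matrix. The sketch below is only an illustration: the weights, the 1 to 5 scores, and the candidate names are all made-up assumptions, and a real evaluation would replace them with numbers agreed on by the team.

```python
# Hypothetical weights: how much each criterion matters for one service.
WEIGHTS = {
    "query_patterns": 0.3,
    "consistency": 0.2,
    "capacity": 0.1,
    "performance": 0.2,
    "maturity": 0.1,
    "cost": 0.1,
}

# Hypothetical scores from 1 (poor fit) to 5 (excellent fit).
CANDIDATES = {
    "candidate_a": {"query_patterns": 5, "consistency": 5, "capacity": 3,
                    "performance": 3, "maturity": 5, "cost": 4},
    "candidate_b": {"query_patterns": 2, "consistency": 3, "capacity": 5,
                    "performance": 5, "maturity": 3, "cost": 3},
}


def weighted_score(scores: dict) -> float:
    """Sum of each criterion's score multiplied by its weight."""
    return sum(WEIGHTS[criterion] * scores[criterion] for criterion in WEIGHTS)


def rank(candidates: dict) -> list:
    """Candidate names ordered from best to worst weighted score."""
    return sorted(candidates, key=lambda name: weighted_score(candidates[name]),
                  reverse=True)
```

Calling `rank(CANDIDATES)` puts the best-scoring option first. The result is only as good as the weights, so it is best treated as a way to structure the discussion rather than as a verdict.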

Last but not least, consider the expertise of your team. Learning a new technology always takes time, and without experience, people will make typical beginner mistakes. Often the best solution is to use the technology your team knows best.
