Challenges with traditional databases and selecting the right NoSQL solution.

courtesy: Simon Jakowicz

Relational databases (RDBMS) have long been a cornerstone of the computing industry, but here, as everywhere else, change is the only constant.

Challenges with traditional databases

  • Not a good fit for large volumes of data with varying structure, e.g. images and videos.

  • Vertical scaling is feasible only up to a certain extent.

  • Sharding causes operational problems e.g. managing shard failures.

  • Consistency (satisfying ACID requirements) has been a bottleneck for scalability in RDBMS.
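To make the sharding point concrete, here is a minimal sketch of hash-based shard routing. It is not taken from any particular database: the shard names, the `shard_for` helper and the simple modulo placement are all assumptions chosen for illustration. It shows how losing one shard changes where many keys are routed, which is one source of the operational pain mentioned above.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(key, shards):
    """Pick a shard by hashing the key (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def moved_keys(keys, before, after):
    """Return the keys whose shard changes when the shard list changes."""
    return [k for k in keys if shard_for(k, before) != shard_for(k, after)]


keys = [f"user:{i}" for i in range(1000)]
survivors = SHARDS[:-1]  # pretend shard-2 has failed
relocated = moved_keys(keys, SHARDS, survivors)
print(f"{len(relocated)} of {len(keys)} keys now map to a different shard")
```

With plain modulo placement, keys that lived on healthy shards can be rerouted too, not only the keys from the failed shard.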

CAP theorem

Benefits of NoSQL

  • NoSQL is a BASE system.
    • Basically Available: The system guarantees availability, in the sense of the CAP theorem.
    • Soft State: The state of the system may change over time, even without an input.
    • Eventual Consistency: The system will become consistent over time, provided that the system does not receive any input during that period.
  • A non-locking concurrency control mechanism, so that real-time reads do not conflict with writes.

  • Scalable replication and distribution.
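The "eventual consistency" idea from the list above can be sketched with a toy model. Everything here is an assumption made for illustration: the `Replica` class, the version numbers, and the last-write-wins rule are not how any specific NoSQL product works. The sketch only shows that a replica may serve stale data until a synchronisation pass runs, after which all replicas agree.

```python
class Replica:
    """One copy of the data, holding a (version, value) pair per key."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Last-write-wins: accept the write only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(replicas):
    """One anti-entropy pass: each replica offers its entries to every replica."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)


a, b, c = Replica("a"), Replica("b"), Replica("c")
a.write("cart", ["book"], version=1)   # the write lands on one replica only
stale_read = b.read("cart")            # replica b has not seen it yet
sync([a, b, c])                        # no new input arrives during the sync
fresh_read = b.read("cart")            # now every replica agrees
```

Here `stale_read` is `None` while `fresh_read` is the written cart, which mirrors the "consistent over time, provided no new input arrives" wording above.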

Now that the SQL vs NoSQL differences are cleared up, it’s time to dig into some of the most popular options and their use cases.

I. Document Store

Most popular, most diverse. Best suited for applications with varied data requirements. Instead of storing data in different tables, data that is frequently queried together is stored together in the same document.
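As a rough illustration of "stored together in the same document", the sketch below folds three table-like lists into one nested document. It uses plain Python dictionaries rather than a real document database, and the table names, fields, `_id` key and `to_document` helper are hypothetical.

```python
import json

# Relational shape: three separate "tables" linked by user id.
users = [{"id": 1, "name": "Ada"}]
addresses = [{"user_id": 1, "city": "London"}]
orders = [
    {"user_id": 1, "item": "keyboard"},
    {"user_id": 1, "item": "mouse"},
]


def to_document(user_id):
    """Gather everything usually queried together into one nested document."""
    user = next(u for u in users if u["id"] == user_id)
    address = next(
        ({"city": a["city"]} for a in addresses if a["user_id"] == user_id),
        None,
    )
    return {
        "_id": user["id"],
        "name": user["name"],
        "address": address,
        "orders": [{"item": o["item"]} for o in orders if o["user_id"] == user_id],
    }


doc = to_document(1)
print(json.dumps(doc, indent=2))
```

Reading the user together with their address and orders is then a single lookup of `doc` instead of a join across three tables.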

MongoDB: The poster child of NoSQL

When to use them?

  • Large websites with a high volume of reads and writes.

  • Real-time analysis and high speed logging.

  • Caching and high scalability.

Some of the big names using MongoDB include Sony, Udacity, IBM, HTC and Foursquare.

II. Column Store

Column stores boast high availability. They are best suited for high-velocity random reads and writes, and they handle sparse or wide columns flexibly. Column stores run on clusters of multiple servers and are a good fit for very large applications.
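The "sparse/wide column" idea can be sketched as a mapping from row key to whatever columns that row happens to have. This is a toy in-memory model and not the storage format of Cassandra or any other product; the `ColumnFamily` class and the sensor data are made up for illustration.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy wide-column table: each row stores only the columns it actually has."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self.rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # .get() on the outer dict avoids creating empty rows on reads.
        return self.rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        return sorted(self.rows.get(row_key, {}))


readings = ColumnFamily()
readings.put("sensor-1", "temp", 21.5)
readings.put("sensor-1", "humidity", 40)
readings.put("sensor-2", "temp", 19.0)  # this row simply has no humidity column
```

Rows do not need to share a schema: `sensor-2` has no `humidity` entry at all, rather than a stored null.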

Cassandra

When to use them?

  • Applications that require frequent writes.

  • Applications that are geographically distributed over multiple data centers.

  • Applications that are very large (>100 TB) and can tolerate short-term inconsistency.

Some of the big names using Cassandra include CERN, Netflix and Facebook.

III. Key-Value Store

Key-Value stores are well suited for frequent but small reads and writes. They have a relatively simple query structure compared to the other three types.

Redis

When to use them?

  • Caching data from RDBMS to improve performance.

  • Tracking transient attributes in a bigger application e.g. shopping cart.

  • Storing configuration and user information for mobile applications.

Some of the big names using Redis are GitHub, StackOverflow and Pinterest.

IV. Graph Store

Graph stores keep relationship information as a first-class entity, which makes them well suited for applications that leverage relationships in the data. They are often built on top of column stores to provide richer functions, e.g. at Facebook.

Neo4j

When to use them?

  • Network and IT infrastructure management.

  • Identity and access management.

  • Recommendation systems and social networks.
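For the recommendation use case, a small sketch may help show what "relationships as first-class entities" buys you. It uses plain Python tuples rather than a real graph database or query language, and the people, the `FRIEND` relationship type and the `recommend` helper are invented for illustration.

```python
from collections import Counter

# Each relationship is its own record: (source, type, target, properties).
relationships = [
    ("alice", "FRIEND", "bob", {"since": 2019}),
    ("alice", "FRIEND", "carol", {"since": 2020}),
    ("bob", "FRIEND", "dave", {"since": 2021}),
    ("carol", "FRIEND", "dave", {"since": 2018}),
    ("carol", "FRIEND", "erin", {"since": 2022}),
]


def neighbours(person):
    """People connected to `person` by a FRIEND relationship, in either direction."""
    found = set()
    for source, rel_type, target, _props in relationships:
        if rel_type != "FRIEND":
            continue
        if source == person:
            found.add(target)
        elif target == person:
            found.add(source)
    return found


def recommend(person):
    """Suggest friends-of-friends, ranked by the number of mutual friends."""
    direct = neighbours(person)
    mutual = Counter()
    for friend in direct:
        for candidate in neighbours(friend):
            if candidate != person and candidate not in direct:
                mutual[candidate] += 1
    ranked = sorted(mutual.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _count in ranked]


suggestions = recommend("alice")
```

The query is expressed as a walk over relationships (friend, then friend of friend) rather than as a chain of joins, which is the pattern graph stores are designed to make cheap.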

Furthermore

I hope this brief introduction helps you select the right NoSQL solution.
