Challenges with traditional databases and selecting the right NoSQL solution.
courtesy: Simon Jakowicz
Relational databases (RDBMS) have long been a cornerstone of the computing industry, but here, as everywhere else, change is the only constant.
Challenges with traditional databases
- Not a good fit for large data volume with varying elements e.g. images and videos.
- Vertical scaling is feasible only up to a certain extent.
- Sharding causes operational problems e.g. managing shard failures.
- Consistency (satisfying ACID requirements) has been a bottleneck for scalability in RDBMS.
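To make the sharding point a bit more concrete, here is a minimal sketch of hash-based shard routing. The `shard_for` helper, the `user:N` key format and the shard counts are all assumptions made up for illustration; real systems route keys in many different ways. It counts how many keys become unreachable while one shard is down, and how many keys have to move when a fifth shard is added to a naive modulo scheme.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical routing scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical workload: 1000 user keys spread over 4 shards.
keys = [f"user:{i}" for i in range(1000)]
placement = {key: shard_for(key, 4) for key in keys}

# If one shard fails, every key routed to it is unreachable until it recovers.
failed_shard = 2
unreachable = [key for key, shard in placement.items() if shard == failed_shard]

# Growing the cluster with naive modulo routing forces many keys to move.
resharded = {key: shard_for(key, 5) for key in keys}
moved = [key for key in keys if placement[key] != resharded[key]]

print(f"{len(unreachable)} of {len(keys)} keys are unreachable while shard {failed_shard} is down")
print(f"{len(moved)} of {len(keys)} keys change shard when growing from 4 to 5 shards")
```

Both numbers are the kind of thing an operations team has to plan around once a relational database is sharded by hand.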
CAP theorem
Benefits of NoSQL
- NoSQL is a BASE system:
  - Basically Available: The system guarantees availability, in terms of the CAP theorem.
  - Soft State: The state of the system may change over time, even without an input.
  - Eventual Consistency: The system will become consistent over time, provided that the system does not receive any input during that period.
- A non-locking concurrency control mechanism so that the real time reads will not conflict with the writes.
- Scalable replication and distribution.
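Eventual consistency is easier to picture with a toy model. The sketch below is not how any particular database implements it; the `Replica` class, the integer versions and the `sync` function are assumptions chosen for illustration (a simple "highest version wins" rule). It shows a stale read while replicas disagree, and convergence once they have exchanged state and no new writes arrive.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    """A toy replica storing key -> (version, value). Hypothetical model."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, version):
        # Keep the incoming value only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(a, b):
    """Exchange state in both directions; the highest version wins on each side."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)

r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")

# Two writes land on different replicas; r3 has seen neither yet.
r1.write("cart", ["book"], version=1)
r2.write("cart", ["book", "pen"], version=2)
stale = r3.read("cart")  # r3 is behind, so this read returns None

# With no further input, a few rounds of syncing bring all replicas in line.
sync(r1, r2)
sync(r2, r3)
sync(r1, r3)
converged = [r.read("cart") for r in (r1, r2, r3)]
```

During the window before the syncs, different replicas give different answers, which is the "soft state" part of BASE.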
Now that the differences between SQL and NoSQL are clear, it’s time to dig into some of the most popular options and their use cases.
I. Document Store
Most popular, most diverse. Best suited for applications with varied data requirements. Instead of storing data in different tables, data that is frequently queried together is stored together in the same document.
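A small sketch of what "stored together in the same document" means in practice. Plain Python dictionaries stand in for both the relational tables and the document collection; the user, address and order data and both `load_profile_*` helpers are made up for illustration and do not use any real database driver.

```python
# Relational layout: the profile is spread across three "tables".
users = [{"id": 1, "name": "Ada"}]
addresses = [{"user_id": 1, "city": "London"}]
orders = [
    {"user_id": 1, "item": "keyboard"},
    {"user_id": 1, "item": "mouse"},
]

def load_profile_relational(user_id):
    """Assemble a profile by looking the user up in every table (a manual join)."""
    user = next(u for u in users if u["id"] == user_id)
    return {
        "name": user["name"],
        "address": next(a["city"] for a in addresses if a["user_id"] == user_id),
        "orders": [o["item"] for o in orders if o["user_id"] == user_id],
    }

# Document layout: everything queried together lives in one document.
documents = {
    1: {
        "_id": 1,
        "name": "Ada",
        "address": {"city": "London"},
        "orders": [{"item": "keyboard"}, {"item": "mouse"}],
    }
}

def load_profile_document(user_id):
    """Fetch the same profile with a single lookup by id."""
    doc = documents[user_id]
    return {
        "name": doc["name"],
        "address": doc["address"]["city"],
        "orders": [o["item"] for o in doc["orders"]],
    }

print(load_profile_relational(1) == load_profile_document(1))
```

The trade-off is that embedded data is duplicated if several documents need it, so this layout pays off mainly when the data really is read as one unit.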
MongoDB: The poster child of NoSQL
When to use them?
- Large websites with a high volume of reads and writes.
- Real-time analysis and high speed logging.
- Caching and high scalability.
Some of the big names using MongoDB include Sony, Udacity, IBM, HTC and Foursquare.
II. Column Store
Column stores boast high availability and are best suited for high-velocity random reads and writes. Their schema is flexible: rows can be sparse or wide, and need not share the same columns. They run on clusters of multiple servers, which makes them a good fit for gigantic applications.
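The sparse/wide idea can be sketched with a toy in-memory structure. The `ColumnFamily` class below is a hypothetical stand-in, not Cassandra's API or storage format: each row key maps to its own set of columns, so one row can carry two columns and another a different set, with nothing stored for the columns a row does not have.

```python
from collections import defaultdict

class ColumnFamily:
    """Toy wide-column table: row key -> {column name -> value}. Hypothetical."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # Writing a column never requires other rows to have it too.
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # .get avoids creating an empty row as a side effect of a read.
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        return sorted(self._rows.get(row_key, {}))

sensors = ColumnFamily()

# A sparse row: only two readings were ever recorded.
sensors.put("sensor-a", "2024-01-01T00:00", 20.5)
sensors.put("sensor-a", "2024-01-01T00:10", 20.7)

# A wider row with a different set of columns.
for minute in range(0, 60, 5):
    sensors.put("sensor-b", f"2024-01-01T01:{minute:02d}", 18.0 + minute / 100)
sensors.put("sensor-b", "location", "roof")

print(sensors.columns("sensor-a"))
print(len(sensors.columns("sensor-b")))
```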
Cassandra
When to use them?
- Applications that always require frequent writes.
- Applications that are geographically distributed over multiple data centers.
- Applications that are really huge (>100 TB) and can tolerate short term inconsistency.
Some of the big names using Cassandra include CERN, Netflix and Facebook.
III. Key-Value Store
Key-Value stores are well suited for frequent but small reads and writes. They have a relatively simple query structure compared to the other three types.
Redis
When to use them?
- Caching data from RDBMS to improve performance.
- Tracking transient attributes in a bigger application e.g. shopping cart.
- Storing configuration and user information for mobile applications.
Some of the big names using Redis are GitHub, StackOverflow and Pinterest.
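The caching use case listed above usually follows a "check the cache first, fall back to the database" pattern. Here is a minimal sketch of it, with an in-memory `TTLCache` class standing in for the key-value store and a fake `query_database` function standing in for the RDBMS. Both are assumptions for illustration only; neither is a Redis client call.

```python
import time

class TTLCache:
    """Tiny key-value store with per-entry expiry (a stand-in, not Redis)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is too old: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)

stats = {"db_calls": 0}

def query_database(user_id):
    """Hypothetical slow relational query; here it just counts its calls."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}

cache = TTLCache(ttl_seconds=60)

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached               # cache hit: the database is not touched
    row = query_database(user_id)   # cache miss: read from the database
    cache.set(key, row)             # remember the result for later reads
    return row

first = get_user(7)
second = get_user(7)
print(stats["db_calls"])
```

The expiry time bounds how stale a cached row can get, which is the usual trade-off against how much load is taken off the database.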
IV. Graph Store
Graph stores treat relationships as first-class entities, which makes them well suited for applications that need to leverage the relationships in their data. They are sometimes built on top of column stores to provide richer functions, e.g. at Facebook.
Neo4j
When to use them?
- Network and IT infrastructure management.
- Identity and access management.
- Recommendation systems and social networks.
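To illustrate the recommendation use case, here is a rough sketch of a "friends of friends" suggestion over a tiny in-memory graph. The `Relationship` and `Graph` classes, the `FRIEND` relationship type and the sample names are all hypothetical; a real graph store such as Neo4j would express this as a query rather than Python loops. The point is that each relationship is stored as its own object with a type and properties, not as a foreign key.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    """A relationship stored as its own object, with a type and a property."""
    start: str
    rel_type: str
    end: str
    since: int

class Graph:
    def __init__(self):
        self._out = defaultdict(list)  # node -> outgoing relationships

    def relate(self, start, rel_type, end, since):
        rel = Relationship(start, rel_type, end, since)
        self._out[start].append(rel)
        return rel

    def neighbours(self, node, rel_type):
        return {r.end for r in self._out.get(node, []) if r.rel_type == rel_type}

def recommend_friends(graph, person):
    """Rank people two hops away by the number of mutual friends."""
    direct = graph.neighbours(person, "FRIEND")
    counts = Counter()
    for friend in direct:
        for candidate in graph.neighbours(friend, "FRIEND"):
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    return [name for name, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

g = Graph()
friendships = [("alice", "bob"), ("alice", "carol"), ("bob", "dave"),
               ("carol", "dave"), ("carol", "erin")]
for a, b in friendships:
    g.relate(a, "FRIEND", b, since=2020)
    g.relate(b, "FRIEND", a, since=2020)

suggestions = recommend_friends(g, "alice")
print(suggestions)  # ['dave', 'erin']
```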
Furthermore
Hope this brief introduction will help you select the right NoSQL solution. Here are some additional resources: