Cassandra is among the most popular NoSQL database systems since it provides:
- High availability: data is replicated across multiple nodes, so there is no single point of failure.
- Linear scalability: nodes can be added or removed as needed.
- Flexible data model: SQL-like terminology with NoSQL-like flexibility to support unstructured data.
- High performance
- Tunable consistency: the consistency level is chosen per read or write, ranging from eventual to strong consistency.
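How strong the consistency guarantee in the list above is depends on how many replicas each read and write must reach. The usual rule of thumb is that a read sees the latest write when the read and write replica counts together exceed the replication factor. The following is a minimal sketch of that arithmetic only. The function names are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Return the size of a majority quorum for a given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)

# Quorum reads plus quorum writes always overlap on at least one replica.
print(q, reads_see_latest_write(q, q, rf))

# Reading and writing a single replica each can miss the latest write.
print(reads_see_latest_write(1, 1, rf))
```

With a replication factor of 3, quorum reads and writes touch 2 replicas each, so any read overlaps the most recent write on at least one replica.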
Let’s see how Cassandra structures data, from the top down:
At the highest level is the cluster, which is made up of nodes. A node is a single instance of Cassandra running on a virtual or physical machine. A node stores data and communicates with the other nodes in its cluster to provide high availability and data consistency.
At the lowest level, a column is a name/value pair with a timestamp; it is the basic building block of the database.
A row consists of a row key (the primary key) and a collection of columns. The set of columns can differ from row to row, which is what provides the flexibility. A group of rows is called a column family (a table in CQL terminology).
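To make the column, row, and column family relationship concrete, here is a small in-memory sketch in Python. It is a toy model and not how Cassandra stores data on disk. The class names, the `put` and `insert` helpers, and the sample keys are all hypothetical. It also assumes that the column with the newer timestamp wins when the same column is written twice.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Column:
    """Basic building block: a name, a value, and a write timestamp."""
    name: str
    value: Any
    timestamp: int  # microseconds since the epoch (assumed unit)


@dataclass
class Row:
    """A row key plus its own collection of columns."""
    key: str
    columns: Dict[str, Column] = field(default_factory=dict)

    def put(self, name: str, value: Any, timestamp: Optional[int] = None) -> None:
        ts = timestamp if timestamp is not None else time.time_ns() // 1000
        current = self.columns.get(name)
        # Keep the write with the newer timestamp (assumed conflict rule).
        if current is None or ts >= current.timestamp:
            self.columns[name] = Column(name, value, ts)


@dataclass
class ColumnFamily:
    """A named group of rows; rows need not share the same columns."""
    name: str
    rows: Dict[str, Row] = field(default_factory=dict)

    def insert(self, key: str, name: str, value: Any,
               timestamp: Optional[int] = None) -> None:
        self.rows.setdefault(key, Row(key)).put(name, value, timestamp)


users = ColumnFamily("users")
users.insert("alice", "email", "alice@example.com", timestamp=2)
users.insert("alice", "city", "Paris", timestamp=2)
users.insert("bob", "email", "bob@example.com", timestamp=2)
# An older write to the same column is ignored.
users.insert("alice", "email", "old@example.com", timestamp=1)

for key, row in users.rows.items():
    print(key, sorted(row.columns))
```

Here the row for `alice` has two columns while the row for `bob` has one, which illustrates how rows in the same column family can carry different columns.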