Tuesday, 29 May 2012

When is Cassandra required for an application?

Cassandra is well suited to big data applications and fits many different data management situations. Some of the most common use cases for Cassandra include:
  • Time series data management
  • High-velocity device data ingestion and analysis
  • Media streaming (e.g., music, movies)
  • Social media input and analysis
  • Online web retail (e.g., shopping carts, user transactions)
  • Web log management / analysis
  • Web click-stream analysis
  • Real-time data analytics
  • Online gaming (e.g., real-time messaging)
  • Write-intensive transaction systems
  • Buyer event analytics
  • Risk analysis and management

Cassandra is typically not the right choice for transactional data that needs per-transaction commit/rollback capabilities. Note that Cassandra does provide atomic writes on a per-row basis, but with no rollback capabilities.
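
To make the per-row atomicity point concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Cassandra API: the names ToyRowStore, write_row and apply_writes, and the validation rule, are all hypothetical and exist only for illustration. It shows that each row write either fully applies or does not apply at all, while a failure partway through a sequence of writes leaves the earlier rows in place.

```python
class ToyRowStore:
    """In-memory stand-in for a row store (hypothetical, not the Cassandra API)."""

    def __init__(self):
        self.rows = {}

    def write_row(self, key, columns):
        # Validate first, so a rejected write changes nothing in this row.
        if not all(isinstance(name, str) for name in columns):
            raise ValueError("column names must be strings")
        # Build the new row version on the side, then swap it in one step.
        updated = dict(self.rows.get(key, {}))
        updated.update(columns)
        self.rows[key] = updated


def apply_writes(store, writes):
    """Apply row writes in order; earlier rows are NOT undone on failure."""
    applied = []
    for key, columns in writes:
        store.write_row(key, columns)
        applied.append(key)
    return applied


store = ToyRowStore()
writes = [
    ("order:1", {"status": "paid"}),
    ("stock:42", {42: "bad column name"}),  # rejected by the toy validation
    ("order:2", {"status": "paid"}),
]
try:
    apply_writes(store, writes)
except ValueError:
    # No rollback: whatever was written before the failure stays written.
    pass

print(sorted(store.rows))
```

In this model the first row survives the failure and the later rows are never written, so an application that needs all-or-nothing behaviour across several rows would have to handle that itself.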

The primary difference between Cassandra and Hadoop is that Cassandra targets real-time/operational data, while Hadoop has been designed for batch-based analytic work.
The two also differ technically in many ways, including Cassandra’s underlying data structure (based on Google’s BigTable); its fault-tolerant, peer-to-peer architecture; multi-data center capabilities; and tunable data consistency. In addition, all nodes in a Cassandra cluster are the same (there is no concept of a namenode, for example).
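
Tunable data consistency is easier to see with a little arithmetic. The sketch below is plain Python, not a driver call; the helper names quorum, replicas_required and read_sees_latest_write are assumptions made for illustration. It uses the commonly cited rule that a read is guaranteed to overlap the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replicas that must acknowledge an operation at a given level.

    Only three levels are modelled here: ONE, QUORUM and ALL.
    """
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when the read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


rf = 3  # assumed replication factor for the example
for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
    overlap = read_sees_latest_write(write_level, read_level, rf)
    print(write_level, read_level, overlap)
```

With a replication factor of 3, writing and reading at QUORUM touches 2 + 2 replicas, so the sets overlap; writing and reading at ONE touches only 1 + 1, which trades that guarantee for lower latency.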


