Monday 4 June 2012

Cassandra and HBase: which one is best?

There is a lot of confusion across blogs about which NoSQL store, HBase or Cassandra, is better. I think it depends on the use case.
  1. Data Model
    1. Cassandra
      1. You have a token ring. Each node takes a section of the ring.
      2. Cassandra lets you choose between a RandomPartitioner (hashing) and an OrderPreservingPartitioner (keys kept in order).
        1. If you use RandomPartitioner you cannot range scan on keys (only on the columns of a key), but you can use OPP and hash the keys yourself.
    2. HBase
      1. You have regions. When regions grow large they split into sub-regions.
      2. Keys are stored only in byte order, so if you insert by timestamp one region gets overloaded. Randomize your inserts if the keys are timestamps.
      3. Practical people know you can hash the keys yourself
      4. HBase has an advantage if you want to range scan on keys
      5. People know you will most likely build your own secondary indexes, as a range scan on keys is not going to give you every SQL feature you can dream of.
  2. Setup
    1. Cassandra
      1. Cassandra is very easy to set up and get working out of the box
      2. People know you are going to have to understand and tune it anyway, so being able to boot up in seconds does not mean much. Just bringing up a node and letting it auto-bootstrap will likely not be what you want
    2. HBase
      1. HBase lives on top of Hadoop, which is both a blessing and a curse
      2. People know setting up Hadoop is not easy
      3. People know HBase has many moving parts (NameNode, Secondary NameNode, JobTracker (for counts), DataNodes, TaskTrackers, ZooKeeper nodes, HBase masters)
      4. People know it's a bigger stack
  3. Cluster size
    1. Cassandra
      1. Cassandra is actually effective at small sizes: 1, 3, 5 nodes
      2. You do not get 'decent' scale-out until ~7 nodes
      3. People say "cool I can grow into this even at a small shop"
    2. HBase
      1. Just too much setup and too many components to make sense at fewer than ~7 nodes
      2. Many people are looking to solve big data problems, so the number of nodes being greater than 7 is not an issue for them. They have some iron to throw at the problem.
  4. Scale out
    1. Cassandra
      1. Cassandra allows you to move nodes and have new nodes take sections of the ring
      2. People have noticed that you have to keep the cluster balanced, which DOES take administrative work
      3. People know growing from 11 to 15 nodes is intensive (compaction, streaming, cleanup)
      4. People see joins sometimes fail and have to do them again (annoying)
    2. HBase
      1. Regions split and move quite frequently and automatically
      2. People see that they do not always move where you want
      3. People see that automatic moving and splitting fails sometimes. It might happen 1 time in 1,000,000, and you may have to do surgery.
  5. Release cycle
    1. Cassandra
      1. The Cassandra code base moves pretty quickly and is pretty agile.
      2. People have been waiting a while for critical features such as 'efficient moves'
      3. There have been some recent releases that were followed very quickly by a bug fix release
    2. HBase
      1. The HBase code base moves pretty quickly and is pretty agile.
      2. People get worried by a dizzying number of JIRAs per release
      3. People need a map to figure out the Hadoop/HBase version matrix and what works together. Hadoop 0.20 append?
  6. Consistency
    1. Cassandra
      1. Cassandra allows the user to choose consistency models
    2. HBase
      1. Offers one consistency model
  7. API & RPC
    1. Cassandra
      1. Thrift and only Thrift
    2. HBase
      1. Pure Java client
  8. Administration
    1. Cassandra
      1. Admins are happy with one log
      2. Log is under control in terms of events
      3. Administration is more than just a log file and the wiki is skimpy
      4. Best practices are skimpy too.
      5. Schema design info is skimpy too.
    2. HBase & Hadoop & ZK & ...
      1. People see A LOT of log files and a lot of cryptic Hadoop messages
      2. People see this actually takes a team of people!
  9. Facebook messaging & U
    1. HBase
      1. Makes sense. Many nodes!
      2. Makes sense. Many Hadoop/HBase committers!
      3. Makes sense. Really really really big scale!
      4. Haters point out 'still uses memcache db'
    2. Cassandra
      1. At the time did NOT have online schema updates (does now in 0.7)
      2. see section on scale out!
      3. Try to get them next go around!
  10. Transactions
    1. Cassandra

    Cassandra does not support transactions in the sense of bundling multiple row updates into one all-or-nothing operation. Nor does it roll back when a write succeeds on one replica but fails on other replicas. It is possible in Cassandra for a write operation to report a failure to the client and still be persisted on a replica.

    However, this does not mean that Cassandra cannot be used as an operational or real-time data store. Data is very safe in Cassandra because writes are durable. All writes to a replica node are recorded both in memory and in a commit log before they are acknowledged as a success. If a crash or server failure occurs before the memory tables are flushed to disk, the commit log is replayed on restart to recover any lost writes.

    2. HBase

    Tables in HBase are initially created with one region by default. For bulk imports, this means that all clients will write to the same region until it is large enough to split and become distributed across the cluster.
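
    One common workaround for this single-region hot spot (and for the timestamp-key overload mentioned under Data Model) is to prefix each row key with a hash-derived bucket, so that writes spread across pre-split regions. Below is a minimal sketch in Python. The helper name `salted_key` and the bucket count of 8 are assumptions made for illustration only; this is not an HBase API.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical: one bucket per pre-split region


def salted_key(natural_key: str, buckets: int = NUM_BUCKETS) -> str:
    """Prefix the key with a bucket derived from a hash of the key.

    The same natural key always maps to the same bucket, so point
    lookups still work; only the physical ordering changes.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}-{natural_key}"


# Sixty consecutive timestamps that would otherwise sort next to each other.
keys = [salted_key(f"2012-06-04T12:00:{s:02d}") for s in range(60)]

# The set of bucket prefixes that these monotonically increasing keys hit.
buckets_used = {k.split("-", 1)[0] for k in keys}
print(sorted(buckets_used))
```

    The cost of this design is on the read side: a range scan over the natural keys now has to fan out to every bucket and merge the results.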

    In Cassandra, writing with a consistency level of ALL means that the data will be written to all N nodes responsible for the particular piece of data, where N is the replication factor, before the client gets a response.  In a standard Cassandra configuration, the write goes into an in-memory table and an in-memory log for each node.  The log is periodically batch flushed to disk; there is also an option to flush per commit, but this option severely impacts performance.  Subsequent reads from any node are strongly consistent and get the most recent update.
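
    The reason a write at ALL makes later reads from any single node consistent can be shown with the usual replica-overlap rule for this style of replication: a read is guaranteed to touch at least one replica holding the latest write when the number of replicas written plus the number read exceeds N. The sketch below assumes that rule; the function name is hypothetical and it models only the counting argument, not Cassandra itself.

```python
def read_overlaps_latest_write(replication_factor: int,
                               write_replicas: int,
                               read_replicas: int) -> bool:
    """Return True when every read set must intersect every write set.

    With N replicas, a write acknowledged by W of them and a read that
    consults R of them share at least one replica whenever W + R > N.
    """
    if not (1 <= write_replicas <= replication_factor):
        raise ValueError("write_replicas must be between 1 and N")
    if not (1 <= read_replicas <= replication_factor):
        raise ValueError("read_replicas must be between 1 and N")
    return write_replicas + read_replicas > replication_factor


N = 3
write_all_read_one = read_overlaps_latest_write(N, write_replicas=3, read_replicas=1)
quorum_both = read_overlaps_latest_write(N, write_replicas=2, read_replicas=2)
one_both = read_overlaps_latest_write(N, write_replicas=1, read_replicas=1)
print(write_all_read_one, quorum_both, one_both)
```

    Writing at ALL with reads at ONE is the case discussed here; writing and reading at a majority satisfies the same inequality, while ONE/ONE does not.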

    In contrast, HBase has only one region server responsible for serving a given piece of data at any one time, and replication is handled at the HDFS layer.  A client sends an update to the region server currently responsible for the update key, and the region server responds with an ack as soon as it updates its in-memory data structure and flushes the update to its write-ahead commit log.  In older versions of HBase, the log was configured, in a similar manner to Cassandra, to flush periodically.  As a few commenters have pointed out, the default configuration of more recent versions of HBase flushes the commit log before acknowledging writes to the client, using group commit to batch flushes across writes for performance.  Replication to the N HDFS nodes responsible for the written data still happens asynchronously, however.  HBase ensures strong consistency by routing subsequent reads through the same region server and, if a region server goes down, by using a system of locks based on ZooKeeper so that reads take into account the latest update.

    Because Cassandra writes data synchronously to all N nodes in this scheme whereas HBase writes data synchronously to only one node, Cassandra is necessarily slower.  In this scheme, write latency in Cassandra is essentially bottlenecked by the slowest machine and subject to variance in network speeds, IO speeds, and CPU loads across machines.  HBase pays a disk cost for its forced log sync, but in high throughput environments, group commit amortizes the disk cost across concurrent requests.
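
    The "bottlenecked by the slowest machine" point can be made concrete with a toy model: if replicas are written in parallel, the client waits for the k-th fastest acknowledgement, where k is the number of acks required. The sketch below uses made-up latencies and a hypothetical helper name; it ignores coordinator overhead, retries and timeouts.

```python
from typing import Sequence


def ack_latency_ms(replica_latencies_ms: Sequence[float], acks_required: int) -> float:
    """Latency seen by the client when replicas are written in parallel.

    The client is unblocked once `acks_required` replicas have answered,
    i.e. at the acks_required-th smallest per-replica latency.
    """
    if not 1 <= acks_required <= len(replica_latencies_ms):
        raise ValueError("acks_required must be between 1 and the replica count")
    return sorted(replica_latencies_ms)[acks_required - 1]


# Made-up per-replica write latencies in milliseconds; one replica is slow.
latencies = [2.0, 3.5, 11.0]

all_latency = ack_latency_ms(latencies, 3)     # wait for every replica
quorum_latency = ack_latency_ms(latencies, 2)  # wait for a majority
one_latency = ack_latency_ms(latencies, 1)     # wait for a single node
print(all_latency, quorum_latency, one_latency)
```

    In this model a write that waits for all replicas always pays for the slowest one, while a write acknowledged by a single node (the HBase-like case in this comparison) pays only for that node.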

    The tradeoff comes in availability. Because only the write-ahead log has been replicated to the other HDFS nodes, if the region server that accepted the write fails, the ranges of data it was serving will be temporarily unavailable until a new server is assigned and the log is replayed. On the other hand, Cassandra will still have and serve the data (given the read level of ONE) even if N-1 nodes responsible for the data go down.
     
     
Let me cap this by saying both products are awesome and have many awesome people heading them up. THERE ARE MANY CASES WHERE I WOULD CHOOSE ONE OR THE OTHER!

1 comment:

  1. great, and very entertaining, write-up!
