This article covers the difference between a supercolumn and a subcolumn in Cassandra.
Let me cut to the chase: there is no difference. They are two terms for exactly the same thing.
If you are familiar with a typical keyspace->column family->row->super column->column structure, such as the one pictured below, then you could safely replace all instances of the phrase “super column” with “subcolumn” without changing the meaning.
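To make that nesting concrete, here is a rough sketch that models the hierarchy as plain Python dictionaries. The row keys, column names, and values are invented for illustration, and Cassandra itself is not involved; only the names “Super1” and “Regular1” come from the configuration entries discussed in this article.

```python
# Hypothetical data, modeled as nested dicts (not a Cassandra client call).
# Super column family: column family -> row -> super column -> column -> value
# Regular column family: column family -> row -> column -> value
keyspace = {
    "Super1": {
        "row-1": {
            "home": {"street": "1 Main St", "city": "Springfield"},
            "work": {"street": "9 Elm Ave", "city": "Shelbyville"},
        },
    },
    "Regular1": {
        "row-1": {"name": "Ada", "email": "ada@example.com"},
    },
}


def depth(value):
    """Count the dict levels at and below value (0 for a plain value)."""
    if not isinstance(value, dict) or not value:
        return 0
    return 1 + max(depth(child) for child in value.values())


# One extra lookup is needed in the super column family.
super_value = keyspace["Super1"]["row-1"]["home"]["city"]
regular_value = keyspace["Regular1"]["row-1"]["email"]

print(depth(keyspace["Super1"]), depth(keyspace["Regular1"]))
```

The only structural difference is the extra level under each row, which is what the “super” in “super column family” refers to.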
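The confusion around “super column” vs. “sub column” is fueled largely by the Cassandra configuration file. In your “storage-conf.xml” file you will see XML “ColumnFamily” configuration elements like this:
<ColumnFamily Name="Super1" ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType" />
If this were a plain old “ColumnFamily” entry, you would only see this:
<ColumnFamily Name="Regular1" CompareWith="BytesType" />
…but this is a “Super Column Family”, so there are two extra attributes:
- ColumnType="Super" to tell Cassandra that this column family will contain super columns.
- CompareSubcolumnsWith="BytesType" to tell Cassandra that our sub columns will be sorted through byte-by-byte comparison.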
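The ordering implied by CompareSubcolumnsWith="BytesType" can be approximated in a few lines of Python. This is only a sketch: it assumes BytesType orders names by their raw, unsigned byte values, and it uses Python's built-in ordering of bytes objects as a stand-in for Cassandra's comparator. The sample names are made up.

```python
# Hypothetical names; encoded to bytes because the comparison is on raw bytes.
names = ["b", "B", "a10", "a2", "é"]
encoded = [name.encode("utf-8") for name in names]

# Python compares bytes objects lexicographically by unsigned byte value.
ordered = sorted(encoded)

for raw in ordered:
    print(raw.hex(), raw.decode("utf-8"))
```

Note that this is not alphabetical or numeric order: uppercase sorts before lowercase, “a10” sorts before “a2”, and multi-byte UTF-8 characters sort after plain ASCII.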
“super column = sub column = supercolumn = subcolumn…”
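First, remember that in Cassandra terminology, “subcolumn” = “supercolumn” = “sub column” = “super column”.
With that in mind, a “super column family” is really just a “column family…that contains super columns under its rows” (as opposed to a regular “column family”, which merely contains rows without supercolumns).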
The confusion comes about because “super column family” entries look like this:
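<ColumnFamily Name="Super1" ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType" />
…and plain old “column family” entries look like this:
<ColumnFamily Name="Regular1" CompareWith="BytesType" />
…both use a tag named “ColumnFamily” in Cassandra’s “storage-conf.xml” definition file.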
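Because both kinds of entry share the “ColumnFamily” tag, the ColumnType attribute is what tells them apart. The sketch below uses Python's standard xml.etree.ElementTree module to classify the two entries from this article. The wrapping “Example” element and the classify function are assumptions made so the snippet parses on its own; they are not part of “storage-conf.xml”.

```python
import xml.etree.ElementTree as ET

# The two entries from this article, wrapped in a hypothetical root element
# so that the fragment is well-formed XML.
SNIPPET = """
<Example>
  <ColumnFamily Name="Super1" ColumnType="Super"
                CompareWith="BytesType" CompareSubcolumnsWith="BytesType" />
  <ColumnFamily Name="Regular1" CompareWith="BytesType" />
</Example>
"""


def classify(xml_text):
    """Map each ColumnFamily name to 'super' or 'regular'."""
    root = ET.fromstring(xml_text.strip())
    kinds = {}
    for entry in root.iter("ColumnFamily"):
        # Treat an entry without ColumnType="Super" as a regular one.
        is_super = entry.get("ColumnType") == "Super"
        kinds[entry.get("Name")] = "super" if is_super else "regular"
    return kinds


kinds = classify(SNIPPET)
print(kinds)
```

Same tag, different attributes: that is all that separates a “super column family” from a plain “column family” in the configuration.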
Personally, I prefer using the term “Column Family” to cover both column families with rows that contain supercolumns as well as column families with rows that don’t contain supercolumns. But if someone uses the term “super column family” they always mean “a column family that contains rows that contain supercolumns.”