At work we are installing a Cassandra cluster as an additional tool in our database toolbox. Hadoop and Hive are bundled with it because we are using the DataStax distribution of Cassandra. This gives us a nice platform for storing data and running Hadoop data mining jobs.
We will be using Cassandra in several different ways:
- A database for information that does not need to be stored relationally.
- A caching server for the output of data jobs run against databases that we do not want web traffic to query directly.
- A data mining platform using Hadoop and Hive/Pig.
Being a .NET shop, we chose Fluent Cassandra as our Cassandra client library after a healthy Fluent Cassandra vs. Aquiles debate.
We are in the process of installing and configuring the cluster now, so I’ll post again after we have the cluster up in our development environment.