At work we are installing a Cassandra cluster as an additional tool in our database toolbox. Because we are using the DataStax distribution of Cassandra, Hadoop and Hive come bundled with it. This gives us a nice platform for storing data and running Hadoop data mining jobs.
We will be using Cassandra in several different ways:
- A database for information that does not need to be stored relationally.
- A caching server for data jobs run against databases that we do not want web traffic to query directly.
- A data mining platform using Hadoop and Hive/Pig.
We are in the process of installing and configuring the cluster now, so I’ll post again once it is up in our development environment.