Pushpalanka's Blog

Showing posts from June, 2012

Bulk-Loading Data to Cassandra with sstable or JMX

June 29, 2012

The 'sstableloader' tool, available from Apache Cassandra 0.8.1 onwards, provides a powerful way to load huge volumes of data into a Cassandra cluster. If you are moving from a cloud cluster to a dedicated cluster or vice versa, or from a different database to Cassandra, you will be interested in this tool. As shown below, in any of these cases, if you can generate 'sstables' from the data to be loaded into Cassandra, you can load them into the cluster in bulk using 'sstableloader'. I tried it with version 1.1.2. In this post I'll share my experience of creating sstables from a .csv file and loading them into a Cassandra instance running on the same machine, which acts as the cluster here.

sstable generation
Bulk loading Cassandra using sstableloader
Using JMX

'sstable' generation

To generate sstables with 'SSTableSimpleUnsortedWriter', the 'cassandra.yaml' file should be present in the classpath. In IntelliJ IDEA you can d...
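To make the loading step a little more concrete, here is a minimal sketch of assembling an 'sstableloader' invocation once the sstables exist. It is not taken from the post: the function name, the example directory and the host are hypothetical, and it assumes the `-d` (initial hosts) option is available in the Cassandra release you run, so check the tool's own usage output first.

```python
import shlex


def build_sstableloader_command(sstable_dir, hosts, executable="bin/sstableloader"):
    """Return the argument list for an sstableloader run.

    sstable_dir: directory holding the generated sstables (hypothetical path).
    hosts: initial hosts to contact; assumes the -d option exists in your release.
    """
    if not hosts:
        raise ValueError("at least one initial host is required")
    return [executable, "-d", ",".join(hosts), sstable_dir]


if __name__ == "__main__":
    # Hypothetical values: a single local node acting as the cluster.
    command = build_sstableloader_command("/tmp/Demo/Users", ["127.0.0.1"])
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(part) for part in command))
```

The sketch only prints the command; passing the returned list to `subprocess.run` would execute it from the Cassandra installation directory.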

Running Cassandra in a Multi-node Cluster

June 11, 2012

This post gathers the steps I followed to set up a multi-node Apache Cassandra cluster. I referred to the Cassandra wiki and the DataStax documentation while setting up my cluster. The procedure is described in detail below, based on my own experience.

Setting up first node
Adding other nodes
Monitoring the cluster - nodetool, jConsole, Cassandra GUI

I used Cassandra 1.1.0 and Cassandra GUI version cassandra-gui-0.8.0-beta1 (as the older release had problems showing data) on Ubuntu.

Setting up first node

Open cassandra.yaml, which is in 'apache-cassandra-1.1.0/conf', and change:

listen_address: localhost --> listen_address: <node IP address>
rpc_address: localhost --> rpc_address: <node IP address>
- seeds: "127.0.0.1" --> - seeds: "node IP address"

The listen address defines where the other nodes in the cluster should connect. So in a multi-node cluster it should to...
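The three cassandra.yaml changes listed above can also be applied with a small script. The following is a rough sketch rather than the procedure used in the post: the function name, the sample file content and the IP address are made up for illustration, and it edits the text with regular expressions instead of a YAML parser so that the rest of the file stays untouched.

```python
import re


def configure_node(yaml_text, node_ip, seed_ips):
    """Return cassandra.yaml text with listen_address, rpc_address and seeds replaced."""
    seeds = ",".join(seed_ips)
    # Top-level settings: replace the whole line.
    text = re.sub(r"(?m)^listen_address:.*$",
                  lambda m: f"listen_address: {node_ip}", yaml_text)
    text = re.sub(r"(?m)^rpc_address:.*$",
                  lambda m: f"rpc_address: {node_ip}", text)
    # The seeds entry is nested, so keep its original indentation.
    text = re.sub(r"(?m)^([ \t]*- seeds:).*$",
                  lambda m: f'{m.group(1)} "{seeds}"', text)
    return text


if __name__ == "__main__":
    # Hypothetical excerpt of a default cassandra.yaml.
    sample = (
        "listen_address: localhost\n"
        "rpc_address: localhost\n"
        "seed_provider:\n"
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "      parameters:\n"
        '          - seeds: "127.0.0.1"\n'
    )
    # Hypothetical address of the first node, which also serves as the seed.
    print(configure_node(sample, "192.168.1.10", ["192.168.1.10"]))
```

Keeping the seed list as a separate parameter lets the same function be reused for later nodes, where the node's own address and the seed address may differ.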