Cassandra (also known as Apache Cassandra) is an open-source distributed database management system. It was developed to handle large amounts of data across commodity hardware or cloud infrastructure. Cassandra provides high availability with no single point of failure.
Cassandra scales linearly: you can add new machines to a cluster with no downtime or interruption to applications, and each added node increases the cluster's read and write throughput.
Every node in a Cassandra cluster has the same role. Data is distributed across the cluster, so each node holds a different portion of the data. Cassandra supports replication and multi-datacenter replication for redundancy, failover, and disaster recovery.
Apache Cassandra requires Java to be installed on the server. You can install either Oracle Java or OpenJDK.
Here, I will use OpenJDK 8.
sudo apt update
sudo apt install -y openjdk-8-jre
Verify the version of Java.
java -version
You should get output similar to the following.
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-8u212-b03-0ubuntu1.18.04.1-b03)
OpenJDK 64-Bit Server VM (build 25.212-b03, mixed mode)
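If you want to script the Java check instead of reading the output by eye, the version string can be parsed. The following is only a minimal sketch: the function names java_major_version and installed_java_version are made up for illustration, and it assumes java -version prints a quoted version string to stderr, as in the output above.

```shell
# Hypothetical helper: extract the major Java version from a version string.
# "1.8.0_212" -> 8 (legacy numbering), "11.0.2" -> 11 (newer numbering).
java_major_version() {
  local version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
  esac
  echo "${version%%[._-]*}"           # keep everything before the first separator
}

# Hypothetical helper: read the quoted version from `java -version` (stderr).
installed_java_version() {
  java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Only run the check when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  major="$(java_major_version "$(installed_java_version)")"
  if [ "$major" = "8" ]; then
    echo "Java 8 detected"
  else
    echo "Java $major detected; this guide uses Java 8" >&2
  fi
fi
```

The parsing is kept separate from the call to java so that the version logic can be tried on any string without Java installed.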
We will install Cassandra using the official package provided by the Apache Software Foundation.
Add the public key for the Cassandra repository so that you won’t encounter a GPG error.
curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
Add the Cassandra repository to your system with the command below.
echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
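Because tee -a appends, running the command above a second time adds a duplicate entry to the file. A hedged sketch of a re-runnable variant follows; the helper names has_line and add_cassandra_repo are hypothetical, while the repository line and file path are the ones used above.

```shell
REPO_LINE="deb http://www.apache.org/dist/cassandra/debian 311x main"
REPO_FILE="/etc/apt/sources.list.d/cassandra.sources.list"

# Hypothetical helper: succeed only if file $2 already contains line $1 exactly.
has_line() {
  [ -f "$2" ] && grep -qxF -- "$1" "$2"
}

# Hypothetical helper: add the repository entry only when it is missing.
add_cassandra_repo() {
  if has_line "$REPO_LINE" "$REPO_FILE"; then
    echo "Repository already configured in $REPO_FILE"
  else
    echo "$REPO_LINE" | sudo tee -a "$REPO_FILE" > /dev/null
  fi
}
```

Calling add_cassandra_repo once or several times should leave a single entry in the file.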
Install Apache Cassandra
Update the repositories.
sudo apt update
Install Cassandra.
sudo apt install -y cassandra
Cassandra’s configuration files are in /etc/cassandra; logs and data are stored in /var/log/cassandra/ and /var/lib/cassandra respectively.
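To see what the packaged configuration contains, you can print a few top-level settings from the main configuration file. This is a sketch under assumptions: the function name show_cassandra_settings is hypothetical, the file is assumed to be /etc/cassandra/cassandra.yaml, and the four keys shown are ones commonly set in that file; adjust the list if your file differs.

```shell
# Hypothetical helper: print selected top-level settings from a cassandra.yaml file.
# Commented-out or indented entries are skipped because the match is anchored
# at the start of the line.
show_cassandra_settings() {
  local config="${1:-/etc/cassandra/cassandra.yaml}"
  grep -E '^(cluster_name|listen_address|rpc_address|native_transport_port):' "$config"
}
```

Run show_cassandra_settings with no argument to read the default path, or pass another path as the first argument.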
Verify Cassandra is running.
sudo service cassandra status
You should see output similar to the following.
● cassandra.service - LSB: distributed storage system for structured data
   Loaded: loaded (/etc/init.d/cassandra; generated)
   Active: active (running) since Tue 2019-07-02 11:04:51 UTC; 1min 30s ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 39 (limit: 4401)
   CGroup: /system.slice/cassandra.service
           └─7679 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -

Jul 02 11:04:51 cas systemd: Starting LSB: distributed storage system for structured data...
Jul 02 11:04:51 cas systemd: Started LSB: distributed storage system for structured data.
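The service can report active (running) a little before Cassandra is ready to answer requests, so a script may want to retry a check rather than run it once. Below is a small, generic retry sketch; wait_until is a hypothetical helper name, and the attempt count and delay in the example are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0            # the command succeeded
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1                # every attempt failed
}

# Example (not executed here): poll nodetool up to 12 times, 5 seconds apart.
#   wait_until 12 5 nodetool status && echo "Cassandra is answering"
```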
Verify Apache Cassandra Cluster
If the previous command produced the expected output, verify the Cassandra cluster by running the command below.
sudo nodetool status
The output below confirms the cluster is up and running.
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  103.67 KiB  256     100.0%            7d9d568b-5287-407a-82ea-2498bd967656  rack1
U – the node's status is Up
N – the node's state is Normal
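The two-letter code at the start of each node line can also be checked in a script. The sketch below reads nodetool status output on standard input and succeeds only when at least one node is listed and every node is UN. The function name all_nodes_up_normal is hypothetical, and the sketch assumes node lines begin with the status and state letters, as in the output above.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed only
# if at least one node is listed and every listed node is UN (Up/Normal).
all_nodes_up_normal() {
  awk '
    # Node lines start with a status letter (U/D) and a state letter (N/L/J/M).
    /^[UD][NLJM] / { total++; if ($1 == "UN") healthy++ }
    END { exit !(total > 0 && healthy == total) }
  '
}

# Example (not executed here):
#   nodetool status | all_nodes_up_normal && echo "all nodes are Up/Normal"
```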
Connect to Cassandra cluster using its command line interface cqlsh
Run cqlsh to connect to the cluster.
cqlsh
You should see a prompt like the one below.
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.4 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
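cqlsh can also run a single statement and exit, which is handy for a quick connection check from a script. The following is a sketch rather than a required step: the wrapper name check_cql_connection and the variable VERSION_QUERY are hypothetical, and the defaults assume the local address and port shown in the banner above.

```shell
# CQL that reads the cluster name and version from the built-in system.local table.
VERSION_QUERY="SELECT cluster_name, release_version FROM system.local;"

# Hypothetical wrapper: run the query non-interactively with cqlsh -e.
# Usage: check_cql_connection [HOST] [PORT]
check_cql_connection() {
  local host="${1:-127.0.0.1}" port="${2:-9042}"
  cqlsh -e "$VERSION_QUERY" "$host" "$port"
}
```

On the node set up in this guide, calling check_cql_connection with no arguments should print one row if the connection works.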