Replacing the down node

To replace a down or unresponsive node with a new node at a different address, configure the replacement node to match the other cluster nodes, then start it with a parameter that identifies the dead node it replaces.

Prepare the replacement node by configuring it with the same cluster name, seed nodes, and data center name as other nodes in the cluster.

Add the following parameter to JVM_OPTS in cassandra-env.sh (default location: /etc/cassandra/conf) for the replacement node:

-Dcassandra.replace_address_first_boot=<dead_node_ip>
CODE

With this parameter set, the new node takes the dead node's place in the cluster when it starts up. Remove the parameter once the replacement node has started successfully and joined the cluster.
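As a minimal sketch of what the added line in cassandra-env.sh might look like, assuming the file builds up JVM_OPTS by appending to it as its other entries do. The address 10.0.0.12 and the variable DEAD_NODE_IP are placeholders for illustration only.

```shell
# Hypothetical address of the dead node; substitute the real one.
DEAD_NODE_IP="10.0.0.12"

# Append the replacement flag to whatever JVM_OPTS already holds,
# rather than overwriting the options set earlier in the file.
JVM_OPTS="${JVM_OPTS:-} -Dcassandra.replace_address_first_boot=${DEAD_NODE_IP}"
```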

To monitor the progress of a replacement (or of a move or remove operation), use the following command, which displays the status of streaming operations:

nodetool netstats
BASH
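The netstats output can also be checked from a script. The sketch below assumes the output contains a line of the form "Mode: <state>" (for example JOINING while streaming, NORMAL afterwards). The helper name netstats_mode is hypothetical, and it reads the command output from standard input so it can be tried on saved output.

```shell
# Print the node's mode from `nodetool netstats` output supplied on stdin.
# Only the first "Mode:" line is used.
netstats_mode() {
    sed -n 's/^Mode:[[:space:]]*//p' | head -n 1
}

# Usage against a live node (not run here):
#   nodetool netstats | netstats_mode
# Polling until the node reports NORMAL (the 30-second interval is arbitrary):
#   until [ "$(nodetool netstats | netstats_mode)" = "NORMAL" ]; do sleep 30; done
```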

Dropping node

Before dropping a node from a live cluster, make sure enough live nodes will remain to satisfy the replication factor. To drop a node that is down, refer to the [Replace Down Node] procedure.

To drop a live node, run the following command on that node:

nodetool decommission
BASH
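The live-node check mentioned above could be scripted roughly as follows. This is a simplified sketch for a single data center: it assumes `nodetool status` marks healthy nodes with a leading "UN" (Up/Normal) and that the replication factor is known to the operator. Both function names are hypothetical.

```shell
# Count nodes reported as Up/Normal ("UN") in `nodetool status` output on stdin.
count_up_normal() {
    grep -c '^UN[[:space:]]'
}

# Succeed if, after removing one node, at least RF live nodes would remain.
# $1 = current number of UN nodes, $2 = replication factor (assumed known).
enough_nodes_after_drop() {
    [ $(( $1 - 1 )) -ge "$2" ]
}

# Usage against a live cluster (not run here), with a replication factor of 3:
#   enough_nodes_after_drop "$(nodetool status | count_up_normal)" 3 && nodetool decommission
```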

To drop a dead/offline node, run the following command:

nodetool removenode <Host ID>
BASH
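The Host ID is listed in the output of `nodetool status`. As a rough sketch, the helper below (a hypothetical name) picks out the address and Host ID of every node whose status starts with "D" (down). It assumes the Host ID appears as a standard UUID on the node's line, and it relies on `grep -o`, which GNU and BSD grep both provide.

```shell
# Print "address host-id" for every down node in `nodetool status`
# output supplied on stdin.
down_node_host_ids() {
    grep -E '^D[NLJM][[:space:]]' | while read -r _state addr rest; do
        # Pull the UUID-shaped field out of the remaining columns.
        id=$(printf '%s\n' "$rest" |
            grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}')
        printf '%s %s\n' "$addr" "$id"
    done
}

# Usage against a live cluster (not run here):
#   nodetool status | down_node_host_ids
```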

If the dropped node was a seed node, remove it from the seed list in cassandra.yaml on all live nodes, then reload the seeds on each of them by running the following command:

nodetool reloadseeds
BASH

Otherwise, the Teamwork Cloud service may not start.

Modifying the seed node

If a seed node was added to, removed from, or replaced in a cluster, the seed node list must be updated.

Use the following command to check what seeds are currently configured for the cluster:

nodetool getseeds
BASH

In cassandra.yaml on every node in the cluster, update the seed list under seed_provider. Then use the following command to apply the change on a live node:

nodetool reloadseeds
BASH

Otherwise, the change does not take effect on a node until that node is restarted.
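To confirm what a node's cassandra.yaml actually contains before reloading, the seeds entry can be extracted with a short helper. This sketch assumes the usual layout, in which seed_provider holds a line of the form `- seeds: "addr1,addr2"`. The function name yaml_seeds and the file path in the usage comment are assumptions.

```shell
# Print the value of the "seeds" entry from cassandra.yaml content on stdin.
# Commented-out lines are ignored because they do not start with "- seeds:".
yaml_seeds() {
    sed -nE 's/^[[:space:]]*-[[:space:]]*seeds:[[:space:]]*"?([^"]*)"?[[:space:]]*$/\1/p'
}

# Usage (path assumed):
#   yaml_seeds < /etc/cassandra/conf/cassandra.yaml
# The result can then be compared with the output of `nodetool getseeds`.
```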

Syncing nodes

The Cassandra repair tool fixes data inconsistencies between nodes. If a down node was replaced, repair streams the differing data between out-of-sync nodes to bring them back in line.

For incremental repair of the current node, use the command:

nodetool repair
BASH

Incremental repairs should be run regularly on each node. Apache recommends doing this every 7 days.
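One way to keep to a 7-day schedule from a script is to record when the last repair ran and compare it with the current time. The sketch below only illustrates the date arithmetic. The function name repair_due and the stamp file in the usage comment are hypothetical.

```shell
# Succeed if at least 7 days have passed between two epoch timestamps.
# $1 = epoch seconds of the last repair, $2 = current epoch seconds.
repair_due() {
    [ $(( $2 - $1 )) -ge $(( 7 * 24 * 60 * 60 )) ]
}

# Usage (stamp file path is an assumption, not run here):
#   stamp=/var/tmp/last_repair_epoch
#   repair_due "$(cat "$stamp")" "$(date +%s)" && nodetool repair && date +%s > "$stamp"
```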

For a full repair of all data, use the command:

nodetool repair --full
BASH

A node that is new to the cluster should bootstrap and sync data from the other nodes. If a new node has not streamed data from the other nodes in the cluster, use the following command to start the process:

nodetool rebuild
BASH


For additional information, refer to the official Cassandra documentation.