Elasticsearch is a very popular open-source, distributed, real-time search and analytics engine. In this article, I am going to share a small tip on migrating one or more Elasticsearch indices from one cluster to another in an easy and neat way.
You cannot back up an Elasticsearch cluster by simply copying the data directories of all of its nodes. Elasticsearch may be making changes to the contents of its data directories while it is running; copying its data directories cannot be expected to capture a consistent picture of their contents. If you try to restore a cluster from such a backup, it may fail and report corruption and/or missing files. Alternatively, it may appear to have succeeded though it silently lost some of its data. The only reliable way to back up a cluster is by using the snapshot and restore functionality.
We will be using the snapshot and restore feature of Elasticsearch to do the migration. For this to work, both Elasticsearch clusters should be running the same version; a snapshot cannot be restored into a cluster running an older version than the one that created it.
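Before starting, it is worth confirming that the two clusters really are on the same version. The snippet below is a minimal sketch, not a definitive tool: it assumes both clusters are reachable over plain HTTP without authentication, and the function names are my own. It relies on the root endpoint (`GET /`) of Elasticsearch, which reports the version under `version.number`.

```python
import json
import urllib.request


def fetch_root(base_url):
    """Call GET / on a cluster and return the parsed JSON response."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/") as response:
        return json.loads(response.read().decode("utf-8"))


def version_number(root_response):
    """Extract the version string, e.g. '7.10.2', from a GET / response."""
    return root_response["version"]["number"]


def same_version(source_root, target_root):
    """Return True when both clusters report exactly the same version."""
    return version_number(source_root) == version_number(target_root)
```

With two reachable clusters (the URLs here are placeholders), the check would be `same_version(fetch_root("http://source:9200"), fetch_root("http://target:9200"))`.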
The high-level steps are given below. There are two ways to do this. The first approach is the easiest if a shared directory is mounted on the nodes of both clusters.
Option 1: Using a shared directory or NFS mount accessible from both clusters.
If you register the same snapshot repository with multiple clusters, only one cluster should have write access to the repository. All other clusters connected to that repository should set the repository to readonly mode.
The snapshot format can change across major versions, so if you have clusters on different versions trying to write to the same repository, snapshots written by one version may not be visible to the other and the repository could be corrupted. While setting the repository to readonly on all but one of the clusters should work with multiple clusters differing by one major version, it is not a supported configuration.
- Register a snapshot repository in the source cluster (Read/Write)
- Register a snapshot repository in the target cluster pointing to the same location (Read Only mode)
- Create a snapshot of the required indices from the source cluster
- Restore the same snapshot in the target cluster
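The four steps of Option 1 map onto the snapshot REST API roughly as follows. This is a sketch under stated assumptions: the repository is of type `fs`, the shared location is already listed under `path.repo` in `elasticsearch.yml` on every node, and the cluster URLs, repository name, snapshot name and index names are placeholders I made up. The functions only build the requests; sending them is left to the caller.

```python
import json
import urllib.request


def build_request(base_url, method, path, body):
    """Build (but do not send) a JSON request for the Elasticsearch REST API."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def register_repo(base_url, repo, location, readonly):
    """PUT /_snapshot/<repo>: register a shared-filesystem repository."""
    body = {"type": "fs", "settings": {"location": location, "readonly": readonly}}
    return build_request(base_url, "PUT", f"/_snapshot/{repo}", body)


def create_snapshot(base_url, repo, snapshot, indices):
    """PUT /_snapshot/<repo>/<snapshot>: snapshot only the listed indices."""
    body = {"indices": ",".join(indices), "include_global_state": False}
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return build_request(base_url, "PUT", path, body)


def restore_snapshot(base_url, repo, snapshot, indices):
    """POST /_snapshot/<repo>/<snapshot>/_restore: restore the listed indices."""
    body = {"indices": ",".join(indices), "include_global_state": False}
    return build_request(base_url, "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body)


def option1_requests(source_url, target_url, repo, location, snapshot, indices):
    """Return the four requests of Option 1, in the order they should be sent."""
    return [
        register_repo(source_url, repo, location, readonly=False),  # source: read/write
        register_repo(target_url, repo, location, readonly=True),   # target: read only
        create_snapshot(source_url, repo, snapshot, indices),
        restore_snapshot(target_url, repo, snapshot, indices),
    ]
```

Each returned request can be sent in order with `urllib.request.urlopen(request)`. Keeping the target repository read only follows the rule above that only one cluster should write to a shared repository.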
Option 2: Copying the snapshots from the source cluster to the target cluster
- Register a snapshot repository in the source cluster
- Create a snapshot of the required indices in the source cluster
- Copy the snapshot directory and its contents from the source cluster to a similar directory in the target cluster
- Register a snapshot repository in the target cluster pointing to the copied snapshot location
- Restore the snapshot in the target cluster
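For the copy step in Option 2, the whole repository directory has to arrive intact, so I like to verify the copy before registering it on the target. The sketch below assumes that both the source and destination paths are visible from the machine running the script (for example through a temporary mount); across hosts you would normally use a tool such as `scp` or `rsync` for the transfer and keep only the verification part. The function names are hypothetical, and the copy should only be made once the snapshot has finished.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large snapshot files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def copy_repository(src, dst):
    """Copy the repository directory, verify it, and return the file count."""
    shutil.copytree(src, dst)  # raises FileExistsError if dst already exists
    copied = file_digests(dst)
    if file_digests(src) != copied:
        raise RuntimeError("copied repository differs from the source")
    return len(copied)
```

Copying into a fresh, empty destination avoids mixing files from two different repositories in one directory.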
If the source and target clusters have indices with the same name, we need to rename the index while doing the restore.
The detailed steps of snapshot creation and restore are available in the official Elasticsearch documentation. The reference links are given below.
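Renaming during restore is done with the `rename_pattern` and `rename_replacement` options of the restore API. The sketch below builds a restore request body that adds a prefix to every restored index; the helper name, the index name and the `restored_` prefix are only examples of mine.

```python
import json


def restore_body_with_prefix(indices, prefix):
    """Build a restore body that restores each index under '<prefix><original name>'."""
    return {
        "indices": ",".join(indices),
        "include_global_state": False,
        # The pattern captures the whole index name; $1 refers to that capture.
        "rename_pattern": "(.+)",
        "rename_replacement": prefix + "$1",
    }


if __name__ == "__main__":
    print(json.dumps(restore_body_with_prefix(["orders"], "restored_"), indent=2))
```

This body is sent with `POST /_snapshot/<repository>/<snapshot>/_restore` on the target cluster, so an index named `orders` in the snapshot would come back as `restored_orders` and leave the existing `orders` index untouched.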