Undeleting and purging KeyTrustee Key Provider keys via the REST interface

HDFS data encryption is an excellent feature that was added recently. With it, we can encrypt data at rest in HDFS and create multiple encryption zones, each with its own encryption key, which lets us secure the data in HDFS properly. For more details, you can visit these websites: Reference1, Reference2

I am using a cluster installed with CDH, on which I created some encryption keys and zones.
The commands I used for creating a key and an encryption zone are given below.

# As the normal user, create a new encryption key
hadoop key create amalKey
 

# As the super user, create a new empty directory and make it an encryption zone
hadoop fs -mkdir /user/amal
hdfs crypto -createZone -keyName amalKey -path /user/amal
 

# chown it to the normal user
hadoop fs -chown amal:hadoop /user/amal
 

# As the normal user, put a file in, read it out
hadoop fs -put test.txt /user/amal/
hadoop fs -cat /user/amal/test.txt
 

After some days, I deleted the encryption zone and the encryption key as well.
The command I used for deleting the encryption key is given below.

hadoop key delete <key-name>

After the deletion, I tried creating a key with the same name, but I got an exception saying that the key still exists in a disabled state. When I list the keys, I am not able to see the key. The exception I got is given below.

amalKey has not been created. java.io.IOException: HTTP status [500], exception [com.cloudera.keytrustee.TrusteeKeyProvider$DuplicateKeyException], message [Key with name "amalKey" already exists in "com.cloudera.keytrustee.TrusteeKeyProvider@6d88562. Key exists but has been disabled. Use undelete to enable.]
java.io.IOException: HTTP status [500], exception [com.cloudera.keytrustee.TrusteeKeyProvider$DuplicateKeyException], message [Key with name "amalKey" already exists in "com.cloudera.keytrustee.TrusteeKeyProvider@6d88562. Key exists but has been disabled. Use undelete to enable.]
at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:159)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:545)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:503)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createKeyInternal(KMSClientProvider.java:676)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createKey(KMSClientProvider.java:684)
at org.apache.hadoop.crypto.key.KeyShell$CreateCommand.execute(KeyShell.java:483)
at org.apache.hadoop.crypto.key.KeyShell.run(KeyShell.java:79)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.crypto.key.KeyShell.main(KeyShell.java:515)

The error message says to use the purge option to permanently delete the key and the undelete option to recover the deleted key, but I was not able to find these options in the hadoop key command. I googled it and couldn't figure out the issue. Finally I got guidance from someone at Cloudera to execute the purge and undelete operations through the REST API of KeyTrustee, along with a nice explanation of my issue. I am briefly putting down the solution for this exception below.

The delete operation on the Trustee key provider is a “soft delete”, meaning that it is possible to “undelete” the key. It is also possible to “purge” the key to delete it permanently. Because these operations are not part of the standard Hadoop key provider API, they are not currently exposed through the Hadoop KeyShell (hadoop key). However, you can call these operations directly via the Trustee key provider REST API.

See the examples below.

Use KeyShell to list existing keys

$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
 
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1

Use KeyShell to delete an existing key

$ ./bin/hadoop key delete amal-testkey-1 -provider kms://http@localhost:16000/kms
 
Deleting key: amal-testkey-1 from KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1 has been successfully deleted.
KMSClientProvider[http://localhost:16000/kms/v1/] has been updated.

Use KeyShell to verify the key was deleted

$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
 
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
 

Use the KeyTrustee key provider REST API to undelete the deleted key

$ curl -L -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=undelete"

Use KeyShell to verify the key was restored

$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
 
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1

Use the KeyTrustee key provider REST API to purge the restored key

$ curl -L -d "trusteeOp=purge" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=purge"

Use KeyShell to verify the key was deleted

$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
 
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
 

Use the KeyTrustee key provider REST API to attempt to undelete the purged key

$ curl -L -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=undelete"

{
  "RemoteException" : {
    "message" : "Key with name amal-testkey-1 not found in com.cloudera.keytrustee.TrusteeKeyProvider@6d88562",
    "exception" : "IOException",
    "javaClassName" : "java.io.IOException"
  }
}

Configure ACLs for KeyTrustee undelete, purge and migrate operations

ACLs for the KeyTrustee-specific undelete, purge and migrate operations are configured in kts-acls.xml. Place this file in the same location as your kms-acls.xml file. See the example below.

<property>
  <name>keytrustee.kms.acl.UNDELETE</name>
  <value>*</value>
  <description>
    ACL for undelete-key operations.
  </description>
</property>

<property>
  <name>keytrustee.kms.acl.PURGE</name>
  <value>*</value>
  <description>
    ACL for purge-key operations.
  </description>
</property>

<property>
  <name>keytrustee.kms.acl.MIGRATE</name>
  <value>*</value>
  <description>
    ACL for migrate-key operations.
  </description>
</property>
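
The * value above allows every user. If KeyTrustee follows the same ACL value format as the standard KMS ACLs in kms-acls.xml (a comma-separated list of users, optionally followed by a space and a comma-separated list of groups), a restricted entry might look like the sketch below; the user and group names are only placeholders.

<property>
  <name>keytrustee.kms.acl.PURGE</name>
  <value>keyadmin keyadmins</value>
  <description>
    Allow only the user keyadmin and members of the keyadmins group to purge keys.
    (The user and group names here are illustrative placeholders.)
  </description>
</property>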
 

Note: In Kerberized environments, the requests will be a little different; they will be in the following format.

Eg:
curl -L --negotiate -u [username]  -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name={username}&trusteeOp=undelete"
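
For example, assuming the user amal has a Kerberos principal (the realm EXAMPLE.COM below is just a placeholder), the flow might look like this; the --negotiate -u : form tells curl to pick up the ticket obtained by kinit.

# Obtain a Kerberos ticket first (principal and realm are assumed placeholders)
$ kinit amal@EXAMPLE.COM

# Call the KeyTrustee REST API with SPNEGO authentication
$ curl -L --negotiate -u : -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=undelete"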

A Brief Overview of Elasticsearch

This is the era of NoSQL databases, and Elasticsearch is one of the popular ones. It is a document database that stores records as JSON. Once you get the basics, it is very easy to work with Elasticsearch, and you can become productive with it in just a few days. You should know JSON before proceeding. Here I am explaining a quick installation and the basic operations in Elasticsearch, which may help some beginners to get started. For learning Elasticsearch, you don't need any server; you can use your desktop/laptop for installing it.

Step 1:
Download the Elasticsearch archive from the Elasticsearch website:
https://www.elastic.co/downloads/elasticsearch

Extract the file, go to the bin folder and execute the elasticsearch script; Linux users should run the elasticsearch script and Windows users should run the elasticsearch.bat file.
Now your Elasticsearch instance will be up, and by default the data will be stored under the folder $ELASTICSEARCH_HOME/data.
Check the URL http://localhost:9200 in your browser. If you get some JSON like the response below, your Elasticsearch instance is running properly.

{
  "status" : 200,
  "name" : "Armor",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.0",
    "build_hash" : "bc94bd81298f81c656893ab1ddddd30a99356066",
    "build_timestamp" : "2014-11-05T14:26:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
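
If you prefer the command line, the same check can be done with curl:

# Query the root endpoint; it should return the cluster info shown above
$ curl -X GET "http://localhost:9200"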

Step 2:

Now we have to load some data into Elasticsearch. For that, we need to send some REST requests.
Install a REST client plugin in your browser (in Mozilla Firefox, an add-on named RESTClient will help you; in Chrome, a plugin such as Postman will help you). Alternatively, you can send the same requests with curl from the command line, as shown in the examples that follow.

Step 3:

Get some sample data to load. I am providing some sample data, but you can use something else as well. In Elasticsearch, we keep data under an index. An index is analogous to a database in an RDBMS: in an RDBMS we have databases, tables and records, while here we have indices, types and documents. Every document has a unique key called the Id.

Here I am adding car details to an index “car”, type “details”, and I am assigning the Ids starting from 1.
To add the first record, send a PUT request with the following details.

URL : http://localhost:9200/car/details/1
METHOD : PUT
BODY:
{
  "carName": "Fabia",
  "manufacturer": "Skoda",
  "type": "mini"
}

Similarly, you can add the second record.

URL : http://localhost:9200/car/details/2
METHOD : PUT
BODY:
{
  "carName": "Yeti",
  "manufacturer": "Skoda",
  "type": "XUV"
}
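
If you are using curl instead of a browser plugin, the same two records can be indexed like this (the index and type are created automatically on the first PUT):

# Index the first record
$ curl -X PUT "http://localhost:9200/car/details/1" -d '{
  "carName": "Fabia",
  "manufacturer": "Skoda",
  "type": "mini"
}'

# Index the second record
$ curl -X PUT "http://localhost:9200/car/details/2" -d '{
  "carName": "Yeti",
  "manufacturer": "Skoda",
  "type": "XUV"
}'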

The full dataset is available on GitHub. You can add all the records like this.

Step 4:

If you want to update a record, just do a PUT request similar to the data load, with the corresponding Id and the new record. It will replace the old document with the new one, and you will see a new version number in the response. The old record will not be available after this.

Eg: Suppose I want to change the record with Id 1, renaming the car from Fabia to FabiaNew.

URL : http://localhost:9200/car/details/1
METHOD : PUT
BODY:
{
  "carName": "FabiaNew",
  "manufacturer": "Skoda",
  "type": "mini"
}
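
With curl, the update and a quick check might look like this; note the _version field in the responses, which increments on every update.

# Replace document 1 with the new record; the response shows an incremented _version
$ curl -X PUT "http://localhost:9200/car/details/1" -d '{
  "carName": "FabiaNew",
  "manufacturer": "Skoda",
  "type": "mini"
}'

# Fetch the document to confirm the new content
$ curl -X GET "http://localhost:9200/car/details/1"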

Step 5:
To get all the indices from Elasticsearch, do the following GET request.

METHOD: GET
http://localhost:9200/_aliases
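
The equivalent curl request:

# List all indices and their aliases
$ curl -X GET "http://localhost:9200/_aliases"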

Step 6:
To get data out of Elasticsearch, we can query it. To get everything, use either of the following requests; the first searches across all indices, while the second searches only the car index.

METHOD: GET
http://localhost:9200/_search
http://localhost:9200/car/_search
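
With curl:

# Search across all indices
$ curl -X GET "http://localhost:9200/_search"

# Search only within the car index
$ curl -X GET "http://localhost:9200/car/_search"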

Step 7:

For more detailed queries, you can try the following.

Query for getting all the vehicles with manufacturer “Skoda”.

METHOD: POST
http://localhost:9200/_search

{"query":

{
"query_string" : {
"fields" : ["manufacturer"],
"query" : "Skoda"
}
}
}
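
The same query sent with curl:

# Return all vehicles whose manufacturer field matches "Skoda"
$ curl -X POST "http://localhost:9200/_search" -d '{
  "query": {
    "query_string": {
      "fields": ["manufacturer"],
      "query": "Skoda"
    }
  }
}'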

Query for getting all the vehicles with manufacturer Skoda or Renault

METHOD: POST
http://localhost:9200/_search

{"query":
{
"query_string" : {
"fields" : ["manufacturer"],
"query" : "Skoda OR Renault"
}
}
}
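
And the same with curl:

# Return all vehicles manufactured by either Skoda or Renault
$ curl -X POST "http://localhost:9200/_search" -d '{
  "query": {
    "query_string": {
      "fields": ["manufacturer"],
      "query": "Skoda OR Renault"
    }
  }
}'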