HDFS data encryption is an excellent, relatively recent feature. It lets us encrypt the data stored in HDFS and create multiple encryption zones, each with its own encryption key, so that different data sets can be secured independently. For more details, see Reference1 and Reference2.
I am using a cluster installed with CDH. I created some encryption keys and zones.
The commands I used to create a key and an encryption zone are given below.
# As the normal user, create a new encryption key
hadoop key create amalKey

# As the super user, create a new empty directory and make it an encryption zone
hadoop fs -mkdir /user/amal
hdfs crypto -createZone -keyName amalKey -path /user/amal

# chown it to the normal user
hadoop fs -chown amal:hadoop /user/amal

# As the normal user, put a file in, read it out
hadoop fs -put test.txt /user/amal/
hadoop fs -cat /user/amal/test.txt
After a few days, I deleted the encryption zone and the encryption key as well.
The command I used to delete the encryption key is given below.
hadoop key delete <key-name>
After the deletion, I tried to create a key with the same name, but I got an exception saying that the key still exists in a disabled state. When I listed the keys, the key did not appear. The exception is given below.
amalKey has not been created. java.io.IOException: HTTP status [500], exception [com.cloudera.keytrustee.TrusteeKeyProvider$DuplicateKeyException], message [Key with name "amalKey" already exists in "com.cloudera.keytrustee.TrusteeKeyProvider@6d88562. Key exists but has been disabled. Use undelete to enable.]
java.io.IOException: HTTP status [500], exception [com.cloudera.keytrustee.TrusteeKeyProvider$DuplicateKeyException], message [Key with name "amalKey" already exists in "com.cloudera.keytrustee.TrusteeKeyProvider@6d88562. Key exists but has been disabled. Use undelete to enable.]
    at org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:159)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:545)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:503)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createKeyInternal(KMSClientProvider.java:676)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createKey(KMSClientProvider.java:684)
    at org.apache.hadoop.crypto.key.KeyShell$CreateCommand.execute(KeyShell.java:483)
    at org.apache.hadoop.crypto.key.KeyShell.run(KeyShell.java:79)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.crypto.key.KeyShell.main(KeyShell.java:515)
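When key creation is scripted, this failure can be told apart from other errors by looking for the "disabled" message in the command output. The snippet below is only a minimal sketch: the function name is_soft_deleted_error is made up for illustration, and it assumes the message text stays exactly as it appears in the trace above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Key Trustee).
# Reads text on stdin and succeeds when it contains the
# "disabled key" message shown in the exception above.
is_soft_deleted_error() {
  grep -q 'Key exists but has been disabled'
}

# Sample text copied from the exception message in this post.
sample='Key exists but has been disabled. Use undelete to enable.'

if printf '%s\n' "$sample" | is_soft_deleted_error; then
  echo "key is soft-deleted: undelete or purge it before re-creating"
fi
```

In a real run, the output of the create command would be piped in instead of the sample text, for example hadoop key create amalKey 2>&1 | is_soft_deleted_error.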
The error logs say to use the purge option to permanently delete the key and the undelete option to recover the deleted key, but I could not find either option in the hadoop key command. Searching online did not help either. Finally, someone from Cloudera advised me to run the purge and undelete operations through the Key Trustee REST API and gave a clear explanation of the issue. The solution is summarized below.
The delete operation on the Trustee key provider is a “soft delete”, meaning that it is possible to “undelete” the key. It is also possible to “purge” the key to delete it permanently. Because these operations are not part of the standard Hadoop key provider API, they are not currently exposed through the Hadoop KeyShell (hadoop key). However, you can call these operations directly via the Trustee key provider REST API.
See the examples below.
Use KeyShell to list existing keys
$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1
Use KeyShell to delete an existing key
$ ./bin/hadoop key delete amal-testkey-1 -provider kms://http@localhost:16000/kms
Deleting key: amal-testkey-1 from KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1 has been successfully deleted.
KMSClientProvider[http://localhost:16000/kms/v1/] has been updated.
Use KeyShell to verify the key was deleted
$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
Use the KeyTrustee key provider REST API to undelete the deleted key
$ curl -L -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=undelete"
Use KeyShell to verify the key was restored
$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
amal-testkey-1
Use the KeyTrustee key provider REST API to purge the restored key
$ curl -L -d "trusteeOp=purge" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=purge"
Use KeyShell to verify the key was deleted
$ ./bin/hadoop key list -provider kms://http@localhost:16000/kms
Listing keys for KeyProvider: KMSClientProvider[http://localhost:16000/kms/v1/]
Use the KeyTrustee key provider REST API to attempt to undelete the purged key
$ curl -L -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name=amal&trusteeOp=undelete"
{
  "RemoteException" : {
    "message" : "Key with name amal-testkey-1 not found in com.cloudera.keytrustee.TrusteeKeyProvider@6d88562",
    "exception" : "IOException",
    "javaClassName" : "java.io.IOException"
  }
}
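The undelete and purge calls above differ only in the operation name, so they can be generated from one small helper. The following is a rough sketch, not an official tool: the function names trustee_op_url and trustee_op_cmd are made up, and the URL layout is simply copied from the curl examples in this post. It only prints the command so that it can be reviewed before anything is run against a real KMS.

```shell
#!/bin/sh
# Hypothetical helpers; the URL layout is taken from the curl examples above.

# Build the Trustee REST URL for one operation on one key.
trustee_op_url() {
  local base="$1" key="$2" user="$3" op="$4"
  printf '%s/v1/trustee/key/%s?user.name=%s&trusteeOp=%s\n' \
    "$base" "$key" "$user" "$op"
}

# Print (do not run) the curl command for "undelete" or "purge".
trustee_op_cmd() {
  local base="$1" key="$2" user="$3" op="$4"
  case "$op" in
    undelete|purge) ;;
    *) echo "unsupported operation: $op" >&2; return 1 ;;
  esac
  printf 'curl -L -d "trusteeOp=%s" "%s"\n' \
    "$op" "$(trustee_op_url "$base" "$key" "$user" "$op")"
}

# Example: show the undelete command for the key used in this post.
trustee_op_cmd "http://localhost:16000/kms" "amal-testkey-1" "amal" "undelete"
```

Printing instead of executing is a deliberate choice here, because purge cannot be reversed; the printed line can be copied and run once it looks right.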
Configure ACLs for KeyTrustee undelete, purge and migrate operations
ACLs for the KeyTrustee-specific undelete, purge and migrate operations are configured in kts-acls.xml. Place this file in the same location as your kms-acls.xml file. See the example below.
<property>
  <name>keytrustee.kms.acl.UNDELETE</name>
  <value>*</value>
  <description>
    ACL for undelete-key operations.
  </description>
</property>
<property>
  <name>keytrustee.kms.acl.PURGE</name>
  <value>*</value>
  <description>
    ACL for purge-key operations.
  </description>
</property>
<property>
  <name>keytrustee.kms.acl.MIGRATE</name>
  <value>*</value>
  <description>
    ACL for migrate-key operations.
  </description>
</property>
Note: In Kerberized environments, the requests are slightly different. They take the following format.
For example:
curl -L --negotiate -u [username] -d "trusteeOp=undelete" "http://localhost:16000/kms/v1/trustee/key/amal-testkey-1?user.name={username}&trusteeOp=undelete"
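If the same script has to work on both plain and Kerberized clusters, the only part that changes is the set of curl options in front of the request. The sketch below shows one way to keep that choice in a single place. The function name trustee_auth_opts and the mode names "simple" and "kerberos" are assumptions made for illustration; the options themselves are the ones used in the examples in this post. A valid Kerberos ticket is presumably also needed before the --negotiate request can succeed.

```shell
#!/bin/sh
# Hypothetical helper: print the curl options for an auth mode, one per line.
#   simple   -> options used in the plain examples (-L)
#   kerberos -> options from the Kerberized example (-L --negotiate -u <user>)
trustee_auth_opts() {
  local mode="$1" user="$2"
  case "$mode" in
    simple)   printf '%s\n' "-L" ;;
    kerberos) printf '%s\n' "-L" "--negotiate" "-u" "$user" ;;
    *) echo "unknown auth mode: $mode" >&2; return 1 ;;
  esac
}

# Example: list the options that would be passed on a Kerberized cluster.
trustee_auth_opts kerberos amal
```

The printed options can then be placed in front of the -d "trusteeOp=..." argument and the request URL when building the curl call.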
Great article. You don’t happen to have an example on how to set up KMS in HA mode and how to handle keys across multiple instances?