Programmatic way to identify the status of the namenode in an HA enabled hadoop cluster

In an namenode HA enabled hadoop cluster, one of the namenodes will be active and the other will be standby. If you want to perform some operations on HDFS programmatically, some of the the libraries or packages need the details of active namenode (some of the packages in python need the details of active namenode, they will not support the nameservice). In this case, the easiest way to get the status is to issue a GET request similar to the one given below on each of the namenodes. This will help us to identify the status of each namenode.

GET REQUEST

curl 'http://namenode.1.host:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'

SAMPLE OUTPUT

{
"beans" : [ {
"name" : "Hadoop:service=NameNode,name=NameNodeStatus",
"modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
"State" : "active",
"SecurityEnabled" : false,
"NNRole" : "NameNode",
"HostAndPort" : "namenode.1.host:8020",
"LastHATransitionTime" : 0
} ]
}

(13)Permission denied: access to /index.html denied

I was getting the error “(13)Permission denied: access to /index.html denied” after deploying a static website in Apache webserver.

Solution:

Apache could not access those directories because of the SELinux security settings. Execute the below command

chcon -R -t httpd_sys_content_t  <document directory>

Python code to list all the running EC2 instances across all regions in an AWS account

This code snippet will help you to get the list of all running EC2 instances across all regions in an AWS account. I have used python boto3 package for developing the code. This code will dynamically pick up all the aws ec2 regions. So the code will work perfectly without any modification even if a new region gets added to the AWS.

Note: Only the basic api calls just to list the instance details are mentioned in this program . Proper coding convention is not followed . 🙂

Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again

Problem:

Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again

Faced this error in CentOS 6.3. This issue started after installing the epel-repo

Solution:

You need to update ca-certificates package. Before that disable all the repos with https that are failing.

Here in my case,  epel-repo is failing, so I have to disable only epel repo:

yum --disablerepo=epel -y update  ca-certificates

The above fix helped me to resolve the issue.

Unique ID in Raspberry Pi

Sometimes we might need a unique id from a raspberry PI. We don’t have to worry about generating a unique id for every device. There is a simple way to use an already existing unique number within the device. The serial number of the chip will be good enough to use as a unique key.

The following command will give the details of the cpu. We can find a serial number from this details and can be used as a unique id.

pi@raspberrypi:~ $ cat /proc/cpuinfo
processor   : 0
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 1
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 2
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 3
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

Hardware   : BCM2709
Revision   : a02082
Serial      : 00000000xxxxxxxx

Hope this info is helpful. !!

How to check the entries in fstab without system reboot

/etc/fstab contains information about the disks. It has the details about where the partitions and storage devices should be mounted. We usually configure automount, disk quota, mount points etc in this fstab.

Inorder to test the entries or modifications in fstab without restart the following commands will be helpful

mount -a

The above command will mount all the filesystems mentioned in the fstab. This is just like a refresh command to activate the entries in fstab.

mount -fav

The above command will help if you don’t want to apply the modifications in the fstab and want to validate the entries only.  This will just fake the entries in the fstab without applying the changes. This is a very useful command.

 

 

Add partitions to hive table with location as S3

Recently I tried to add a partition to a hive table with S3 as the storage. The command I tried is given below.

ALTER table mytable ADD PARTITION (testdate='2015-03-05') location 's3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable/testdate=2015-03-05';

I got the following exceptions

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.SentryFilterDDLTask. MetaException(message:Got exception: org.apache.hadoop.fs.FileAlreadyExistsException Can't make directory for path 's3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable' since it is a file.) (state=08S01,code=1)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:s3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable/testdate=2015-03-05 is not a directory or unable to create one)

Solution:

Use S3n instead of S3a. It will work. So the S3 url should be

s3n://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable/testdate=2015-03-05

 

 

How to attach a new EBS to an EC2 instance

Nowadays majority of us are using some cloud services. Amazon Web Services is one of the popular provider among all the other cloud service providers. Just like we upgrade our harddisk or mounting new drives to physical machines, we can attach new block storages to Amazon EC2 also. Amazon provides a service called EBS (Elastic Block Storage). There are various types of EBS with various speed and cost. Example are magnetic, SSD etc.

Attaching a new EBS to a running EC2 instance is very simple. We can do this programatically as well as using the console. Here I am explaining the basic steps to perform this operation using the console.

  1. Launch an EBS in the same region and same availability zone as that of the EC2 instance
  2. Note down the instance id of the EC2 instance
  3. Attach the EBS to the EC2. This can be done by using the attach option available in the EBS. The EBS will be listed under the Volumes section in EC2 service page of AWS console.
  4. Login to the EC2 instance and switch to the root user
  5. Type lsblk to list all the block devices
  6. Identify the new block device.
  7. Create a new directory to mount the EBS.
  8. Format the newly mounted storage. The command is mkfs -t ext4 /dev/<device-name>
  9. Mount the EBS on the directory. The command is mount /dev/<device-name>  <mount-dir>
  10. Check for the new storage. The command is df -h

 

 

Disable SELinux without reboot

To disable the SELinux by modifying /etc/sysconfig/selinux file, we have to perform a reboot. In some cases, we may not be able to perform a reboot because this involves a downtime of the system. In this situations we can disable SELinux by using a simple command. This will not disable SELinux permanently. The effect will last until the next reboot, but you have the option to edit the selinux file so that it will be in the disabled state even after  the reboot also. The steps for disabling selinux permanently are explained in my previous post.

The command the check the status of SELinux is given below.

sestatus

This may show enforcing or permissive or disabled. In permissive mode, SELinux will not block anything, but merely warns you. The line will show enforcing when it’s actually blocking.

To disable the SELinux temporarily we can use the following command. This has to be executed as root or using sudo.

setenforce 0

After this command execution we can check the status of selinux using sestatus command. If it is permissive, we are good to go. 🙂