Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again


Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again

Faced this error in CentOS 6.3. This issue started after installing the epel-repo


You need to update ca-certificates package. Before that disable all the repos with https that are failing.

Here in my case,  epel-repo is failing, so I have to disable only epel repo:

yum --disablerepo=epel -y update  ca-certificates

The above fix helped me to resolve the issue.


Unique ID in Raspberry Pi

Sometimes we might need a unique id from a raspberry PI. We don’t have to worry about generating a unique id for every device. There is a simple way to use an already existing unique number within the device. The serial number of the chip will be good enough to use as a unique key.

The following command will give the details of the cpu. We can find a serial number from this details and can be used as a unique id.

pi@raspberrypi:~ $ cat /proc/cpuinfo
processor   : 0
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 1
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 2
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

processor   : 3
model name   : ARMv7 Processor rev 4 (v7l)
BogoMIPS   : 38.40
Features   : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer   : 0x41
CPU architecture: 7
CPU variant   : 0x0
CPU part   : 0xd03
CPU revision   : 4

Hardware   : BCM2709
Revision   : a02082
Serial      : 00000000xxxxxxxx

Hope this info is helpful. !!

How to check the entries in fstab without system reboot

/etc/fstab contains information about the disks. It has the details about where the partitions and storage devices should be mounted. We usually configure automount, disk quota, mount points etc in this fstab.

Inorder to test the entries or modifications in fstab without restart the following commands will be helpful

mount -a

The above command will mount all the filesystems mentioned in the fstab. This is just like a refresh command to activate the entries in fstab.

mount -fav

The above command will help if you don’t want to apply the modifications in the fstab and want to validate the entries only.  This will just fake the entries in the fstab without applying the changes. This is a very useful command.



Add partitions to hive table with location as S3

Recently I tried to add a partition to a hive table with S3 as the storage. The command I tried is given below.

ALTER table mytable ADD PARTITION (testdate='2015-03-05') location 's3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable/testdate=2015-03-05';

I got the following exceptions

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.SentryFilterDDLTask. MetaException(message:Got exception: org.apache.hadoop.fs.FileAlreadyExistsException Can't make directory for path 's3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable' since it is a file.) (state=08S01,code=1)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:s3a://XXXACCESS-KEYXXXXX:XXXSECRET-KEYXXX@bucket-name/DATA/mytable/testdate=2015-03-05 is not a directory or unable to create one)


Use S3n instead of S3a. It will work. So the S3 url should be




How to attach a new EBS to an EC2 instance

Nowadays majority of us are using some cloud services. Amazon Web Services is one of the popular provider among all the other cloud service providers. Just like we upgrade our harddisk or mounting new drives to physical machines, we can attach new block storages to Amazon EC2 also. Amazon provides a service called EBS (Elastic Block Storage). There are various types of EBS with various speed and cost. Example are magnetic, SSD etc.

Attaching a new EBS to a running EC2 instance is very simple. We can do this programatically as well as using the console. Here I am explaining the basic steps to perform this operation using the console.

  1. Launch an EBS in the same region and same availability zone as that of the EC2 instance
  2. Note down the instance id of the EC2 instance
  3. Attach the EBS to the EC2. This can be done by using the attach option available in the EBS. The EBS will be listed under the Volumes section in EC2 service page of AWS console.
  4. Login to the EC2 instance and switch to the root user
  5. Type lsblk to list all the block devices
  6. Identify the new block device.
  7. Create a new directory to mount the EBS.
  8. Format the newly mounted storage. The command is mkfs -t ext4 /dev/<device-name>
  9. Mount the EBS on the directory. The command is mount /dev/<device-name>  <mount-dir>
  10. Check for the new storage. The command is df -h



Disable SELinux without reboot

To disable the SELinux by modifying /etc/sysconfig/selinux file, we have to perform a reboot. In some cases, we may not be able to perform a reboot because this involves a downtime of the system. In this situations we can disable SELinux by using a simple command. This will not disable SELinux permanently. The effect will last until the next reboot, but you have the option to edit the selinux file so that it will be in the disabled state even after  the reboot also. The steps for disabling selinux permanently are explained in my previous post.

The command the check the status of SELinux is given below.


This may show enforcing or permissive or disabled. In permissive mode, SELinux will not block anything, but merely warns you. The line will show enforcing when it’s actually blocking.

To disable the SELinux temporarily we can use the following command. This has to be executed as root or using sudo.

setenforce 0

After this command execution we can check the status of selinux using sestatus command. If it is permissive, we are good to go. 🙂

Disable SELinux in CentOS and RHEL

Security-Enhanced Linux (SELinux) is a security architecture integrated into the 2.6.x kernel using the Linux Security Modules. It is a project of the United States National Security Agency (NSA) and the SELinux community. SELinux integration into Red Hat Enterprise Linux was a joint effort between the NSA and Red Hat.

Most of the application needs SELinux to be turned off. Turning off selinux is simple. You can use the following steps to turn off selinux in RHEL or CentOS 6 and 7 operating systems.

Open the file /etc/sysconfig/selinux . The contents will be similar as below.

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.


The contents are self explanatory. Change the value of SELINUX as disabled and save the file. Then reboot the system.

Good Quote.!!

“A beginning programmer writes her programs like an ant builds her hill, one piece at a time, without thought for the bigger structure. Her programs will be like loose sand. They may stand for a while, but growing too big they fall apart.

Realizing this problem, the programmer will start to spend a lot of time thinking about structure. Her programs will be rigidly structured, like rock sculptures. They are solid, but when they must change, violence must be done to them.

The master programmer knows when to apply structure and when to leave things in their simple form. Her programs are like clay, solid yet malleable.”

— Master Yuan-Ma, The Book of Programming

Heterogeneous storages in HDFS

From hadoop 2.3.0 onwards, hdfs supports heterogeneous storage. What is this heterogeneous storage? What are the advantages of using this?.

Hadoop came as a processing system for processing and storing huge data, a scalable batch processing system. But now it became the platform for DataLake for Enterprises. In large enterprises, various types of data needs to be stored and processed for advanced analytics. Some of these data are required frequently, some are not required frequently, some are required very rarely. If we store all these in the same platform or hardware, the cost will be more. For example, if we are using a cluster in AWS. We have EC2 nodes for our cluster nodes. EC2 uses EBS and ephemeral storage. Depending upon the type of storage, the cost varies. S3 storage is cheaper than EBS storage, but access speed will be less. Similarly glacier will be cheaper compared to S3, but again the data retrieval will take time. Similarly, if we want to keep data in different storage types depending upon the priority and requirement, we can use this feature in hadoop. This feature was not available in earlier versions of hadoop. This is available in hadoop version 2.3.0 onwards. Now datanode can be defined as a collection of storages. Various storage policies available in hadoop are Hot, Warm, Cold, All_SSD, One_SSD and Lazy_Persist.

  • Hot – for both storage and compute. The data that is popular and still being used for processing will stay in this policy. When a block is hot, all replicas are stored in DISK.
  • Cold – only for storage with limited compute. The data that is no longer being used, or data that needs to be archived is moved from hot storage to cold storage. When a block is cold, all replicas are stored in ARCHIVE.
  • Warm – partially hot and partially cold. When a block is warm, some of its replicas are stored in DISK and the remaining replicas are stored in ARCHIVE.
  • All_SSD – for storing all replicas in SSD.
  • One_SSD – for storing one of the replicas in SSD. The remaining replicas are stored in DISK.
  • Lazy_Persist – for writing blocks with single replica in memory. The replica is first written in RAM_DISK and then it is lazily persisted in DISK.

Recovering a corrupted EC2 instance

Amazon Web Services is one of the most popular cloud service providers. I am a customer of Amazon. I like the services provided by Amazon very much. Compared to other cloud service providers AWS is simple, secure and advanced. I use EC2 machines for my project related activities as well as my personal experiments. Since I mostly work on open source software, 99.99 % of my EC2 instances are Linux instances. The only way to access these instances is through ssh. I use putty as the ssh client. If something happens to the ssh server, we will not be able to access the server. Sometimes the ssh server crashes due to overload. This can be resolved by rebooting the instance.

Sometimes because of wrong configs in the sshd config file, the ssh server may stop. The ssh server will not start until we make that file proper and restart the service. But for making these changes we have to access the machine.

By default we don’t have direct root login into the machine. We usually login to one user which is a sudo user and using sudo privileges, we access the root. If something happens to the sudoers file or if some wrong entry made in sudoers file, the root access will be revoked.

These are some of the commonly occurred situations where users loose access or super user privilege in the ec2 machine. Most of the users terminate and leave the instance in this situation.

If the instance is an EBS backed instance, we don’t have to terminate and leave the machines in this kind of situations. We can recover these instances. It is simple and can be done in few steps. If the instance is with ephemeral storage, we cannot do anything, because shutting down will clear all the data in the instance.

  1. Start a new instance in the same availability zone as that of the EBS of the broken machine. Micro or nano instance type is fine. If you already have an instance, no need of this instance.
  2. Stop the broken machine. Note down the mount locations
  3. Detach the EBS from the instance.
  4. Attach the EBS to the second EC2 instance (The newly launched one)
  5. Mount the EBS to some directory in the second EC2 instance.
  6. Navigate through the files and directories and make the required changes.
  7. Unmount the EBS
  8. Detach the EBS from the second instance
  9. Attach the EBS to the first instance
  10. Use the same mount location as that of the orginal
  11. Start the instance.

This should fix the problem.