How to extract a tar.gz file quickly in Linux

Recently I received a tar.gz file of around 30 GB that expands to approximately 4 TB on extraction. A normal extraction was taking approximately a day, so I wanted to speed it up. After a lot of searching, I finally found a solution.

The solution was pigz, a parallel implementation of gzip. Decompression itself cannot be parallelized, so it still happens in a single thread, but pigz uses separate threads for reading, writing and checksum calculation. In my case the overall performance was far better than with plain gzip.

The command to install pigz on CentOS or RHEL is given below. Ensure that the EPEL repository is enabled on your system.

yum install pigz

The command to extract a tar.gz file using pigz is given below.

pigz -dc compressed.tar.gz | tar xf -
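
If pigz may not be installed on every machine, the same pipeline can be wrapped in a small function that falls back to gzip. This is only a sketch: extract_tgz is a name made up for this example, and the optional destination directory is an addition that the command above does not have.

```shell
# extract_tgz ARCHIVE [DEST]: hypothetical helper, not part of pigz or tar.
extract_tgz() {
    archive=$1
    dest=${2:-.}
    # Prefer pigz; gzip accepts the same -dc flags but is slower.
    if command -v pigz >/dev/null 2>&1; then
        decompress=pigz
    else
        decompress=gzip
    fi
    mkdir -p "$dest" || return 1
    "$decompress" -dc "$archive" | tar -xf - -C "$dest"
}
```

It would be called as extract_tgz compressed.tar.gz /data/extracted, where /data/extracted is just an example path.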

If you want to see the progress of the extraction, you can use Pipe Viewer (pv), a tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to show how quickly data is passing through and how long it has taken. When pv knows the total size of the data, it also shows how near to completion the transfer is and an estimate of the time remaining.

Pipe Viewer can be installed on CentOS or RHEL using the following command:

yum install pv

Using pv, we can monitor the progress of the decompression process:

pigz -dc compressed.tar.gz | pv | tar xf -
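
In the pipeline above, pv sits in the middle of the pipe and cannot know how much data is coming, so it reports throughput and elapsed time but no percentage. One possible variation is to let pv read the archive itself, since it can then compare the bytes read with the file size. The sketch below assumes that layout; extract_with_progress is a hypothetical name, and cat is used as a silent stand-in when pv is missing.

```shell
# extract_with_progress ARCHIVE [DEST]: hypothetical helper name.
extract_with_progress() {
    archive=$1
    dest=${2:-.}
    # pv opens the file itself, so it knows the total (compressed) size.
    if command -v pv >/dev/null 2>&1; then reader=pv; else reader=cat; fi
    # Fall back to gzip when pigz is not installed.
    if command -v pigz >/dev/null 2>&1; then unzipper=pigz; else unzipper=gzip; fi
    mkdir -p "$dest" || return 1
    "$reader" "$archive" | "$unzipper" -dc | tar -xf - -C "$dest"
}
```

Note that the percentage refers to the compressed bytes read so far, not to the amount of data written to disk.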

How to migrate Docker images from one server to another without using a Docker registry/repository?

Copying a Docker image from one server to another is an easy task, and the following steps explain how to do it. Before getting into the actual steps, let's go over a few terms.

What is a Docker image?

An image is an immutable master copy. We can compare a Docker image to an ISO image of an operating system. Running an image creates a container, and we can run any number of containers from the same image.

What is a Docker container?

A container is basically a running instance of an image. Changes can be made inside the container; they are applied on top of the base image while it runs as a container. A container can be thought of as a booted image.

Docker save, export and load commands

docker save writes a Docker image to disk. The saved file includes all the layers of the image and the metadata required to chain these layers together and rebuild the image, so docker save preserves the history of all the layers present in the image. We can copy this saved file to another server, load the image there and run containers from it.

The syntax is

docker save -o [filename] [imagename]:[version]

The above command will save the image into the given file name. You can also provide the complete path along with the file name.

The docker load command will load the image back from file into the system. To load this image from the file, use the following command.

docker load -i [saved image file name]
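
When several images have to be moved, typing the file names by hand gets tedious. Below is a minimal sketch, assuming docker is on the PATH: archive_name and save_image are hypothetical helper names, and replacing '/' and ':' with '_' is just one possible naming scheme.

```shell
# archive_name IMAGE: turn "repo/name:tag" into a safe file name (hypothetical helper).
archive_name() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# save_image IMAGE [DIR]: run docker save and print the path of the file written.
save_image() {
    image=$1
    dir=${2:-.}
    out="$dir/$(archive_name "$image")"
    docker save -o "$out" "$image" && printf '%s\n' "$out"
}
```

The printed path is the file to copy to the other server and pass to docker load -i.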

docker export creates a snapshot of a container: it saves the container’s current file system as a tar file. It does not preserve the layers or history of the container’s parent image, and it does not export the contents of volumes associated with the container. The resulting tar file can be turned back into an image with docker import.

Docker save needs to be performed on a docker image and docker export is performed on a docker container.
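
For comparison, a container snapshot can be piped straight from docker export into docker import to create a new image. The function below is only an illustration: snapshot_container is a hypothetical name, and the resulting image is a single flattened layer, so settings such as CMD and ENTRYPOINT from the original image are not carried over.

```shell
# snapshot_container CONTAINER IMAGE: hypothetical helper name.
# Exports the container's file system and imports it as IMAGE (e.g. web01:snapshot).
snapshot_container() {
    container=$1
    image=$2
    docker export "$container" | docker import - "$image"
}
```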

To copy a Docker image from one host to another in a single shot, use the following command. It requires the bzip2 package to be installed on both hosts.

docker save [image]:[version] | bzip2 | ssh username@hostname 'bunzip2 | docker load'

Note: To install bzip2 on CentOS/RHEL, use the following command

yum install bzip2

For Ubuntu

apt-get install bzip2

I hope this article helped you. 🙂

How to find and kill a process locking a particular port in Linux?

Sometimes, because of some issue or bug, our application may stop working while its process keeps the port locked. This kind of issue is very common with MySQL server, Elasticsearch, web services, Tomcat etc. In such scenarios, we have to find the hung process and kill it to free up the locked port.

How to find the process that locks the port?

Use the following command

netstat -tulpn | grep <port>

The output of this command contains the process ID and program name in the last column (run it as root to see processes owned by other users). Now we just need to kill the process.
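
If only the process ID is needed, for example in a script, it can be picked out of the netstat output with awk. This is a rough sketch that assumes the usual Linux net-tools column layout (local address in the fourth column, PID/Program name near the end); pids_on_port is a made-up name.

```shell
# pids_on_port PORT: read `netstat -tulpn` output on stdin, print PIDs bound to PORT.
pids_on_port() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ {
            n = split($4, addr, ":")      # the port follows the last colon
            if (addr[n] != port) next
            # The PID/Program column looks like "1234/mysqld".
            for (i = 5; i <= NF; i++)
                if ($i ~ /^[0-9]+\//) { split($i, p, "/"); print p[1]; break }
        }' | sort -u
}
```

It would be used as netstat -tulpn | pids_on_port 3306, with 3306 as an example port.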

Verify the process

Before killing the process, figure out what process it is and ensure we are not killing any required processes.

ps -fp <process id>

The output of the above command will give the details of the process.

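How to kill a process?

After confirming the details, you can kill the process. A plain kill <process id> sends SIGTERM and gives the process a chance to shut down cleanly. If it does not exit, force it with

kill -9 <process id>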

Now verify whether the port is still locked by executing the netstat command again.
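
The kill step can also be scripted so that the forced kill is only a last resort. A minimal sketch, assuming a POSIX shell; stop_process and its default 10-second timeout are choices made for this example.

```shell
# stop_process PID [TIMEOUT]: hypothetical helper; SIGTERM first, SIGKILL if needed.
stop_process() {
    pid=$1
    timeout=${2:-10}
    # Ask politely; fail if the process does not exist or cannot be signalled.
    kill "$pid" 2>/dev/null || return 1
    while [ "$timeout" -gt 0 ]; do
        # kill -0 sends no signal, it only checks whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -9 "$pid" 2>/dev/null
}
```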

How to auto connect OpenVPN during Windows boot up?

Generally we establish a VPN connection with OpenVPN using the Connect option in the GUI application. Sometimes we may come across situations in which the VPN needs to connect automatically at system boot.

I had a similar requirement. I have a desktop server at a remote location that I want to access from my laptop, and the desktop is reachable only through my VPN. So if someone turns the desktop off and on again, the VPN needs to connect automatically during boot so that I can access it from my network without any assistance from others. Here are the steps that I followed to achieve this: I created a task in the Windows Task Scheduler. My operating system was Windows 10 (the same steps should work in all recent versions of Windows).

Step 1: Open Task Scheduler

Search for Task Scheduler and open it.

[Screenshot: opening Task Scheduler]

Step 2: Click on Create Task

Once you open the Task Scheduler, you can see several options. Select the Create Task option to create a new task.

[Screenshot: the Create Task option]

Step 3: Configure the Task details

Start creating the task by filling in the following details in the General section.

[Screenshot: task details in the General section]

Step 4: Add new Trigger to the Task

A trigger tells the system when to run this task. We need to create a new trigger for the task. Click on New and create a trigger as explained in the next step.

[Screenshot: creating a new trigger]

Step 5: Configure the Trigger

We will configure the trigger details in this section. For “Begin the task”, choose “At startup”. This means the task will be triggered during the startup of the system. Further tweaking is possible through the parameters in the advanced settings section.

[Screenshot: trigger configuration]

Step 6: Create new Action

This is the main section: the action is what the task runs when it is triggered. Here we need to select “Start a program” as the action.

[Screenshot: creating a new action]

Step 7: Configure Action

Our program is the OpenVPN client. Browse to the OpenVPN installation and select openvpn-gui.exe. The most important part is the arguments field, where we need to specify the config file to connect with (OpenVPN GUI provides a --connect option for this, e.g. --connect amal.ovpn). Here my config file is named amal.ovpn and it is located in the config directory of the OpenVPN installation. If this argument is missing, the auto connect will not work. A simple way to test the command is to run it directly from the command line (PowerShell is recommended).

E.g.: Go to the bin directory of OpenVPN (C:\Program Files\OpenVPN\bin) and open PowerShell there.

Execute the following command. The “amal.ovpn” can be replaced with your VPN config file name.

[Screenshot: testing the command in PowerShell]

If the above command works, complete the action configuration and save the details.

[Screenshot: completed action configuration]

Note: The amal.ovpn is the vpn configuration file and is located in the OpenVPN config directory which defaults to “C:\Program Files\OpenVPN\config”

After configuring this, click OK to save the task. Then test the task by rebooting the system. I have configured this setup several times in several places and it worked perfectly.

Hope this article helped you 🙂 . If you are facing any issues, please comment on this post, I will be happy to help you.

What is swap memory and how to clear swap usage in Linux?

What is swap space?

Swap is space on disk that the system uses when physical memory (RAM) runs low. It increases the virtual memory available to the system: once RAM is nearly full, the kernel moves less-used memory pages to swap. Since swap resides on disk, accessing it is much slower than accessing physical memory (RAM).

Why do we need swap space?

Suppose we have a system with 4 GB of RAM. When we start the system, memory usage is low, but as we open applications or start processes, it increases. If usage reaches 4 GB and there is no swap, the system cannot run additional applications until some RAM is freed, and the kernel may even kill processes to reclaim memory. With swap, the space allocated on disk is used for the additional demand, and applications keep running even after crossing the limit of the system RAM. As already explained, swap is very slow compared to RAM.

How does memory management work internally?

The Linux kernel includes a memory manager that tracks the memory pages (or blocks) used by all processes and identifies the less frequently used ones. When memory demand exceeds the available RAM, the memory manager moves these less frequently used pages to the disk space allocated for swap (“swapping” or paging). This frees up RAM, which becomes available to the applications that are actively running.

How to clear the swap memory usage?

If you want to clear the swap memory, you can execute the following command in the terminal as root user.

swapoff -a && swapon -a

WARNING: Be careful doing this, as it may affect your system’s stability, especially if it is already low on RAM: everything held in swap has to fit back into physical memory. It is better not to run such swap-clearing commands from a cron job.
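
One way to reduce the risk is to check first whether the data currently in swap would fit into free RAM. The sketch below reads /proc/meminfo and assumes a kernel that reports MemAvailable (3.14 or later); swap_fits_in_ram is a hypothetical name, and the check is a rough guide rather than a guarantee.

```shell
# swap_fits_in_ram [MEMINFO]: succeed only if used swap is smaller than MemAvailable.
# MEMINFO defaults to /proc/meminfo; values in that file are in kB.
swap_fits_in_ram() {
    awk '
        /^MemAvailable:/ { avail = $2 }
        /^SwapTotal:/    { total = $2 }
        /^SwapFree:/     { free  = $2 }
        END {
            used = total - free
            printf "swap used: %d kB, memory available: %d kB\n", used, avail
            exit !(used < avail)    # exit status 0 means it should fit
        }' "${1:-/proc/meminfo}"
}
```

As root, it could be combined with the command above as swap_fits_in_ram && swapoff -a && swapon -a.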

Linux commands to check the disk utilization, size of directory or file

  • Command to check the disk utilization

df -h

The ‘-h’ option will provide the utilization in human readable format.

  • Command to check the size of a directory

du -sh <directory name>

  • Command to check the size of a file

du -sh <file name>

  • Command to check the size of files in a directory

Go inside the directory and execute the following command

du -sh *
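
du -sh * lists the entries in alphabetical order, which makes the largest ones hard to spot in a big directory. Here is a small sketch that sorts them by size instead; largest_entries is a made-up name, and it prints kilobytes rather than using -h so that a plain numeric sort works. Hidden files are skipped because * does not match them.

```shell
# largest_entries [DIR] [COUNT]: hypothetical helper; sizes are printed in kilobytes.
largest_entries() {
    dir=${1:-.}
    count=${2:-10}
    # -s: one total per entry, -k: sizes in kB; largest entries first.
    du -sk -- "$dir"/* | sort -rn | head -n "$count"
}
```

For example, largest_entries /var/log 5 would print the five largest entries under /var/log.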