
How to extract a tar.gz file quickly in Linux

Recently I received a tar.gz file of around 30 GB that expands to approximately 4 TB on extraction. A normal extraction was taking about a day, so I wanted to speed it up. After a lot of searching, I found a solution.

The solution was pigz, a parallel implementation of gzip. Decompression itself still runs in a single thread, because the gzip format cannot be decompressed in parallel, but pigz uses separate threads for reading, writing and checksum calculation. As a result, extraction can be noticeably faster than with normal gzip.

The command to install pigz on CentOS or RHEL is given below. Ensure the EPEL repository is enabled on your system.

yum install pigz

The command to extract a tar.gz file using pigz is given below.

pigz -dc compressed.tar.gz | tar xf -

If you want to see the progress of the extraction, you can use Pipe Viewer (pv), a tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through and how long it has taken. When pv knows the total size of the data, it also shows how near to completion the transfer is and an estimate of the time remaining.

Pipe Viewer can be installed on CentOS or RHEL using the following command.

yum install pv

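Using pv, we can monitor the progress of the decompression process. Placed after pigz as in the command below, pv reports throughput and elapsed time; it cannot show a percentage or a time estimate because it does not know the total size of the decompressed stream.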

pigz -dc compressed.tar.gz | pv | tar xf -
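If the extraction has to be started from a script, the same pipeline can be assembled in Python. The following is only a minimal sketch: the function names build_pipeline and run_pipeline are made up for illustration, the fallback to gzip when pigz is missing is an assumption, and the -C option is used to tell tar where to extract.

```python
import shutil
import subprocess


def build_pipeline(archive, dest=".", progress=False):
    """Return the pipeline stages as a list of argv lists."""
    # Assumption: fall back to plain gzip when pigz is not installed.
    decompressor = "pigz" if shutil.which("pigz") else "gzip"
    stages = [[decompressor, "-dc", archive]]
    if progress and shutil.which("pv"):
        stages.append(["pv"])  # pv prints its progress on stderr
    stages.append(["tar", "xf", "-", "-C", dest])
    return stages


def run_pipeline(stages):
    """Run the stages connected by pipes and return their exit codes."""
    procs = []
    prev_out = None
    for index, argv in enumerate(stages):
        is_last = index == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_out,
            stdout=None if is_last else subprocess.PIPE,
        )
        if prev_out is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if a downstream stage exits early.
            prev_out.close()
        prev_out = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

Calling run_pipeline(build_pipeline("compressed.tar.gz", progress=True)) would then behave like the shell pipeline above; every exit code in the returned list should be 0 on success.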

 


How to find and kill a process locking a particular port in Linux?

Sometimes, because of an issue or bug, an application stops working but its port stays occupied. This is very common with MySQL server, Elasticsearch, web services, Tomcat and similar services. In such scenarios, we have to find the stale process that still holds the port and kill it to free the port.

How to find the process that locks the port?

Use the following command

netstat -tulpn | grep <port>

The output of this command contains the process ID in the PID/Program name column (run it as root to see processes owned by other users). Now we just need to kill the process.

Verify the process

Before killing the process, figure out what process it is and make sure we are not killing a required process.

ps -fp <process id>

The output of the above command will give the details of the process.

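How to kill the process?

After confirming the details, you can kill the process. Try a plain kill <process id> first, which sends SIGTERM and lets the process shut down cleanly; use -9 (SIGKILL) only if it does not exit.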

kill -9 <process id>

Now verify whether the port has been freed by executing the netstat command again.
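As a quick complement to netstat, a script can check whether anything is still accepting connections on the port. This is a rough sketch only: the function name port_in_use is hypothetical, it tests TCP on the given host only, and it does not tell you which process owns the port.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(3306) should return False once a stuck MySQL process has been killed and nothing else has taken the port.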

What is swap memory and how to clear swap usage in Linux?

What is swap space?

Swap is space on disk that the system uses when physical memory (RAM) runs short. It increases the virtual memory available to the system: when RAM fills up, the kernel moves inactive memory pages to swap. Because swap resides on disk, accessing it is much slower than accessing RAM.

Why do we need swap space?

Suppose we have a system with 4 GB of RAM. When the system starts, memory usage is low, but it grows as we open applications or start processes. If usage reaches 4 GB and there is no swap, new allocations can fail or the kernel may kill processes to reclaim memory. With swap, the space allocated on disk absorbs the extra demand, so applications keep running even after usage exceeds the size of the RAM. As already explained, swap is much slower than RAM.

How does memory management work internally?

The Linux kernel includes a memory management subsystem. It tracks the memory pages used by all processes and identifies the less frequently used ones. When memory demand exceeds the available RAM, it writes these less frequently used pages to the disk space allocated for swapping (also called paging). This frees RAM for the applications that are actively running.

How to clear the swap memory usage?

To clear the swap, execute the following command in a terminal as the root user. It disables all swap areas, which moves their contents back into RAM, and then enables them again.

swapoff -a && swapon -a

WARNING: Be careful doing this, as it may affect your system’s stability, especially if it is already low on RAM: swapoff needs enough free RAM to hold everything currently in swap. It is better not to run swap-clearing scripts as a cron job.
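Before clearing swap, it helps to confirm that the available RAM can hold what is currently swapped out. The sketch below parses the text of /proc/meminfo; the function names are hypothetical, and it assumes a kernel recent enough to report the MemAvailable field.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict (most values are in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def safe_to_clear_swap(info):
    """Return True if available RAM exceeds the swap currently in use."""
    swap_used = info["SwapTotal"] - info["SwapFree"]
    # Assumption: MemAvailable is a good enough estimate of usable RAM.
    return info["MemAvailable"] > swap_used
```

On a real system you would pass in the file content, for example parse_meminfo(open("/proc/meminfo").read()), and run the swapoff command only when safe_to_clear_swap returns True.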

Linux commands to check the disk utilization, size of directory or file

  • Command to check the disk utilization
df -h

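The ‘-h’ option prints sizes in human-readable units (K, M, G). In the du commands below, ‘-s’ prints a single total for each argument instead of listing every subdirectory.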

  • Command to check the size of a directory
du -sh <directory name>
  • Command to check the size of a file
du -sh <file name>
  • Command to check the size of files in a directory

Go inside the directory and execute the following command

du -sh *
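The same information can be collected from a Python script, which is handy for monitoring jobs. This is a sketch with hypothetical function names; it adds up apparent file sizes, so the figures can differ slightly from df and du, which count allocated disk blocks.

```python
import os
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem containing path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def tree_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symbolic links
                total += os.path.getsize(file_path)
    return total
```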

 

Virtual environment in Python

What is a virtual environment?

A virtual environment is an isolated Python environment that lets developers keep the dependencies of different projects separate from one another.

Why do we need a virtual environment?

Suppose User A and User B are working on two different projects.

The package requirements for User A are given below.

Click==7.0
Flask==1.0.2
Flask-Cors==3.0.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.12.0
Werkzeug==0.14.1

The package requirements for User B are given below.

Click==6.0
Flask==1.0.1
Flask-Cors==3.0.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.11.0
Werkzeug==0.11.2

As you can see, the two developers use different versions of the same packages. Python cannot keep multiple versions of the same package side by side in one site-packages directory. By default, packages are installed into the site-packages directory of the Python installation. On Unix-like operating systems this location is usually owned by the root user, so a normal user cannot install packages there without elevated privileges.
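To see exactly where the two requirement lists clash, the pinned versions can be compared with a few lines of Python. This is an illustrative sketch; the function names are made up, and it only understands simple name==version lines like the ones above.

```python
USER_A = """Click==7.0
Flask==1.0.2
Flask-Cors==3.0.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.12.0
Werkzeug==0.14.1""".splitlines()

USER_B = """Click==6.0
Flask==1.0.1
Flask-Cors==3.0.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.11.0
Werkzeug==0.11.2""".splitlines()


def parse_pins(lines):
    """Map lower-cased package names to their pinned versions."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip().lower()] = version.strip()
    return pins


def conflicting_pins(first, second):
    """Return {package: (version_in_first, version_in_second)} for mismatches."""
    a, b = parse_pins(first), parse_pins(second)
    return {name: (a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]}
```

For the two lists above, conflicting_pins(USER_A, USER_B) reports five packages (Click, Flask, Flask-Cors, six and Werkzeug) that could not share a single site-packages directory.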

A virtual environment helps in the following scenarios:

  • To isolate the dependencies required for different projects.
  • To keep the base Python packages untouched. In a multi-user environment, upgrading or modifying a shared package might disrupt one or more projects.
  • To let end users install and manage Python packages without system-level elevated privileges.
  • To easily manage the dependencies used in a specific project. The environment can be replicated on another system by dumping the package list (pip freeze) and installing it there. Copying the environment directory itself is less reliable, because its scripts contain absolute paths.

How to create a virtual environment?

To create a virtual environment, we need the virtualenv package installed in the base Python environment.

As the root user or with elevated privileges, execute the following command.

pip install virtualenv

Then create the virtual environment with the following command. The virtualenv command will be available only if the package was installed in the base python.

virtualenv <path for virtual environment>

You can specify any writable location as the path for the virtual environment. All the files and packages of the virtual environment will be installed in this directory. It may take a few seconds to complete the setup.

Now type

which python

This will still point to the base Python. To use the virtual environment, we need to activate it.

source <path of virtual environment>/bin/activate

The above command activates the virtual environment in the current session. To activate it by default in every session, add this line to your .bashrc file.

Now type which python again and check the result. It will point to the newly created virtual environment.

To deactivate the environment, simply type deactivate on the command line.
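The same check can be done from inside Python, which is useful when a script should refuse to install packages into the base interpreter. A minimal sketch, assuming a hypothetical helper named in_virtualenv:

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the built-in venv module make
    # sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```

After activation, python -c "import sys; print(sys.prefix)" should print the path of the virtual environment rather than the base installation.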

 

How to check whether a Raspberry Pi is 32 bit or 64 bit?

Recent Raspberry Pi models come with a 64-bit CPU, while earlier models had a 32-bit CPU. Some software depends on the CPU and OS architecture.

There are various options to check the architecture.

Method 1:

Type the following command and check the response.

uname -m

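You will get a response such as armv6l, armv7l or aarch64.

ARMv7 and below are 32-bit. ARMv8 introduced the 64-bit instruction set, and a 64-bit OS reports it as aarch64. Note that uname -m shows the architecture of the running kernel, so a 64-bit capable board running a 32-bit kernel still reports armv7l.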

Method 2:

Install lshw using the command

apt-get install lshw

Then type the command lshw.  You will be able to find the architecture from the response of the command.
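A script can make the same decision using the value that uname -m reports, which Python exposes through platform.machine(). The helper name arch_bits below is hypothetical, and the mapping covers only the common ARM values; like uname, it describes the running OS, not what the CPU is capable of.

```python
import platform


def arch_bits(machine=None):
    """Return 64 or 32 for common ARM machine names, or None if unknown."""
    machine = (machine or platform.machine()).lower()
    if machine in ("aarch64", "arm64"):
        return 64
    # armv6l, armv7l and armv8l all indicate a 32-bit environment.
    if machine.startswith("armv"):
        return 32
    return None  # not an ARM name this sketch knows about
```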

How to clear/delete the cached Kerberos ticket?

In Linux

kdestroy

 

In Windows

klist purge

How to containerize a Python Flask application?

Containerization is one of the fastest-growing and most powerful technologies in the software industry. With it, users can build, ship and deploy applications (standalone and distributed) seamlessly. Here are the simple steps to containerize a Python Flask application.

Step 1:
Develop your Flask application. For this demonstration I am using a very simple Flask application; you can use your own and proceed with the remaining steps. If you are new to this technology, I recommend starting with this simple program. As usual in tutorials, it is a “Hello World” program, and since we are discussing Docker, we can call it “Hello Docker”. I will demonstrate the containerization of a more advanced application in my next post.

import json
from flask import Flask

app = Flask(__name__)

@app.route("/requestme", methods = ["GET"])
def hello():
    response = {"message":"Hello Docker.!!"}
    return json.dumps(response)


if __name__ == '__main__':
    app.run(host="0.0.0.0", port=9090, debug=True)

Step 2:
Ensure the project is properly packaged and its dependencies are listed in requirements.txt. A properly packaged project is easy to manage. All dependent packages must be present in the environment where the code runs, and they will be installed from requirements.txt, so prepare the dependency list carefully. Since our program is a simple single-module application, there is not much to package. Here I am keeping the Python file (myflaskapp.py) and requirements.txt in a folder named myproject, without any package structure.

 

Step 3:
Create the Dockerfile. The file must be named “Dockerfile”. Here I have used the Python 2 base image. If you use python:3, Python 3 will be the base image, so select the base image based on your requirement. Note that Python 2 reached end of life in January 2020, so python:3 is the better choice for new projects.

FROM python:2
# Copy the project folder into the image as /myproject
COPY myproject /myproject
WORKDIR /myproject
RUN pip install -r requirements.txt
CMD [ "python", "./myflaskapp.py" ]

Ensure you create the Dockerfile without any extension. By default Docker looks for a file named exactly Dockerfile, so a file saved as Dockerfile.txt will not be found.

Step 4:
Build an image using the Dockerfile. Ensure the Python project and the Dockerfile are in the proper locations.
Run the following command from the location where the Dockerfile is kept. The syntax of the command is given below.

docker build -t [imagename]:[tag] [location]

The actual command is given below. Here I am executing the build command from the directory that contains both the Dockerfile and the project, so I am using a dot (.) as the location. If the Dockerfile is in a different location, you can specify it using the option -f or --file.

docker build -t myflaskapp:latest .

Step 5:
Run a container from the image

docker run -d -p 9090:9090 --name myfirstapp myflaskapp:latest

Step 6:
Verify the application
List the running containers

docker ps | grep myfirstapp

Now your application is containerized.
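Listing the container only shows that it is running. To confirm that the application actually answers, you can request the /requestme route through the published port. The helper below is a small sketch using only the standard library; the function name fetch_json is hypothetical, and the URL assumes the container runs on the local machine with port 9090 mapped as in Step 5.

```python
import json
import urllib.request


def fetch_json(url, timeout=5):
    """Request url and return (HTTP status, decoded JSON body)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
        return resp.status, json.loads(body)
```

If the container is healthy, fetch_json("http://localhost:9090/requestme") should return status 200 together with the "Hello Docker.!!" message from Step 1.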


Step 7:
Save the Docker image locally. The following command saves the image as a tar file. You can take this file to any other environment and load it there with docker load.

docker save myflaskapp > myflaskapp.tar

You can also push the image to Docker Hub.

In this way you can ship and run your application anywhere.

Configure Network in CentOS / RHEL from command line

How many of you are aware of a text user interface for network configuration? A tool called nmtui (NetworkManager Text User Interface) is available on CentOS and Red Hat systems. You can open it by typing nmtui on the command line.

If this command is not available, you have to install the NetworkManager-tui package.

yum install NetworkManager-tui

Type nmtui on the command line to open the console, where you can edit the network configuration.

nmtui


Disable Sleep mode in CentOS7/RHEL7 laptop on lid close

The following tip will help you disable the power-saving or sleep behavior of your CentOS or RHEL laptop when the lid is closed. If a GUI is present, the following steps will help.

Applications => Utilities => Tweak Tool => Shell => Don't suspend on lid close => ON

But if no GUI is installed, the only option is to disable this from the command line. It is very easy, don’t worry. Who cares about the GUI in Linux? 🙂 (I love the black screen)

Open /etc/systemd/logind.conf and set the following option, removing the leading # if the line is commented out. By default, its value is suspend.

HandleLidSwitch=ignore
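If you manage several machines, the edit can be scripted. The sketch below works on the text of the file and returns the modified text; the function name set_lid_switch is hypothetical, and it assumes that [Login] is the only section in the file, so a missing option can simply be appended at the end.

```python
import re


def set_lid_switch(conf_text, value="ignore"):
    """Return conf_text with HandleLidSwitch set to value."""
    line = "HandleLidSwitch=" + value
    # Match the option whether or not it is commented out; the "=" right
    # after the name keeps HandleLidSwitchDocked and similar keys untouched.
    pattern = re.compile(r"^[ \t]*#?[ \t]*HandleLidSwitch=.*$", re.MULTILINE)
    if pattern.search(conf_text):
        return pattern.sub(line, conf_text, count=1)
    # Assumption: [Login] is the only section, so appending is safe.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + line + "\n"
```

Read /etc/systemd/logind.conf, pass its content through set_lid_switch, and write the result back as root, keeping a backup copy of the original file.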

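Restart the systemd-logind service (systemctl restart systemd-logind) or reboot for the change to take effect. man logind.conf provides the complete details about this configuration file. Hope this tip helps.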