RJDBC java.lang.OutOfMemoryError

You might see the following error while making JDBC connections from R programs.

java.lang.OutOfMemoryError: Java heap space

If you hit Java heap space errors like the one above in RJDBC connections, increase the Java heap size from your R program. Set the option before loading RJDBC, because the JVM picks up java.parameters only when it starts. A sample snippet is given below.

options(java.parameters = "-Xmx8048m")
library("RJDBC")

or

options(java.parameters = "-Xmx8g")
library("RJDBC")

Hope this helps you.

How to extract a tar.gz file quickly in Linux

Recently I got a tar.gz file of around 30 GB that expands to approximately 4 TB on extraction. I wanted to speed up the extraction, as the normal extraction was taking approximately a day. I searched a lot and finally figured out a solution.

The solution was pigz, a parallel implementation of gzip. Decompression itself still happens in a single thread, but pigz uses separate threads for reading, writing and checksum calculations, so the overall performance is far better than the normal gzip.

The command to install pigz in CentOS or RHEL is given below. Ensure the EPEL repository is enabled on your system.

yum install pigz

The command to extract a tar.gz file using pigz is given below.

pigz -dc compressed.tar.gz | tar xf -

If you want to see the progress of the extraction process, you need to use Pipe Viewer (pv). PV (“Pipe Viewer”) is a tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion.

Pipe Viewer can be installed in CentOS or RHEL using the following command.

yum install pv

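Using pv, we can monitor the progress of the decompression process. Placed after pigz, pv does not know the total size of the stream, so it shows throughput and elapsed time rather than a percentage.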

pigz -dc compressed.tar.gz | pv | tar xf -
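
If the extraction has to be driven from a script, the same pipeline can be sketched in Python. This is only a minimal sketch under my own assumptions: the function names build_extract_commands and extract are hypothetical, and the fallback to plain gzip when pigz is not on the PATH is my addition, not part of the commands above.

```python
import shutil
import subprocess


def build_extract_commands(archive, dest="."):
    """Return the two argv lists for a 'decompress | tar' extraction.

    Uses pigz when it is on PATH, otherwise falls back to plain gzip
    (the fallback is an assumption of this sketch).
    """
    decompressor = "pigz" if shutil.which("pigz") else "gzip"
    decompress_cmd = [decompressor, "-dc", archive]
    untar_cmd = ["tar", "-xf", "-", "-C", dest]
    return decompress_cmd, untar_cmd


def extract(archive, dest="."):
    """Run the pipeline and raise if either stage fails."""
    decompress_cmd, untar_cmd = build_extract_commands(archive, dest)
    producer = subprocess.Popen(decompress_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(untar_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so the producer notices if tar exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    if consumer_rc != 0 or producer_rc != 0:
        raise RuntimeError("extraction failed")
```

Calling extract("compressed.tar.gz") would then behave like the pigz command above, without the pv stage.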

How to migrate docker images from one server to another without using a docker registry/repository ?

Copying a docker image from one server to another is an easy task, and the following steps explain how to do it. Before getting into the actual steps, let's go through a few terms.

What is a docker image ?

An image is an immutable master copy. A docker image can be compared to an ISO image of an operating system. Running the image creates a container, and we can run any number of containers from the same image.

What is a docker container ?

A container is a running instance of an image. Changes can be made inside the container; they are applied on top of the base image while it runs. A container can be thought of as a booted image.

Docker save, export and load commands

docker save will save a docker image to the disk. This saved file includes all the layers of images and the metadata required to chain these layers to rebuild the current image. So the docker save command will preserve the history of all the layers present in the current image. We can copy this saved file to another server to load the image and run containers.

The syntax is

docker save -o [filename] [imagename]:[version]

The above command will save the image into the given file name. You can also provide the complete path along with the file name.

The docker load command will load the image back from file into the system. To load this image from the file, use the following command.

docker load -i [saved image file name]

docker export creates a snapshot of a container. It saves the container’s current file system as a tar file, which can be turned back into an image with docker import. It does not preserve the history or layers of the container's parent image, and it does not export the contents of volumes associated with the container.

Docker save needs to be performed on a docker image and docker export is performed on a docker container.

To copy a docker image from one host to another in a single shot, the following command will help. For this command to work, the bzip2 package needs to be installed on both hosts.

docker save [image]:[version] | bzip2 | ssh username@hostname 'bunzip2 | docker load'

Note: For installing bzip2 in centos/rhel, use the following command

yum install bzip2

For ubuntu

apt-get install bzip2

I hope this article helped you. 🙂

How to find and kill a process locking a particular port in Linux?

Sometimes, because of some issue or bug, our application may stop working while its process keeps holding the port. This kind of issue is very common with MySQL server, Elasticsearch, web services, Tomcat etc. In such scenarios, we have to find the stale process and kill it to free up the port.

How to find the process that locks the port?

Use the following command. Run it as root to see the process ids of processes owned by other users.

netstat -tulpn | grep <port>

The output of this command will contain the process id. Now we just need to kill the process.

Verify the process

Before killing the process, figure out what process it is and ensure we are not killing a required process.

ps -fp <process id>

The output of the above command will give the details of the process.

How to Kill a process ?

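After confirming the details, you can kill the process. kill -9 sends SIGKILL, which gives the process no chance to clean up, so try a plain kill <process id> first if a graceful shutdown matters.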

kill -9 <process id>

Now verify whether the port is still locked by executing the netstat command again.
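
The final check can also be scripted. The snippet below is a minimal sketch in Python, assuming the service listens on TCP on the local machine; the function name is_port_in_use is my own and it only tells whether something accepts connections on the port, not which process holds it.

```python
import socket


def is_port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, is_port_in_use(3306) should report whether a MySQL server on the default port is still accepting connections.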

How to auto connect OpenVPN during windows boot up?

Generally we establish a VPN connection with OpenVPN using the connect option present in the GUI application. Sometimes we may come across situations in which we need the VPN to connect automatically on system boot.

I had a similar requirement. I have a desktop server which is located remotely and I want to access it from my laptop. The desktop is accessible only through my VPN. So if someone turns off the desktop, the VPN needs to connect automatically on reboot so that I can access it from my network without any assistance from others. Here are the steps that I followed to achieve this. I created a task in the Windows Task Scheduler. My operating system was Windows 10 (the same steps will work in all recent versions of Windows).

Step 1: Open Task Scheduler

Search for Task Scheduler and open it.

[Screenshot: openvpn_OpenTaskScheduler]

Step 2: Click on Create Task

Once you open the Task Scheduler, you can see several options. Select the Create Task option to create a new task.

[Screenshot: openvpn_createtask]

Step 3: Configure the Task details

Start creating the task by filling in the following details in the General section.

[Screenshot: openvpn_taskdetails]

Step 4: Add new Trigger to the Task

A trigger tells the system when to run the task. We need to create a new trigger for this task. Click on New and create a trigger as explained in the next step.

[Screenshot: openvpn_createtrigger]

Step 5: Configure the Trigger

We will configure the trigger details in this section. Choose Begin the task: At startup. This means the task will be triggered during the startup of the system. Further tweaking can be done by setting the parameters in the advanced settings section.

[Screenshot: openvpn_trigger]

Step 6: Create new Action

This is the main section. This is the action that gets triggered by the task. Here we need to select the action “Start a program”.

[Screenshot: openvpn_create_action]

Step 7: Configure Action

Our program is the OpenVPN client. Browse to the OpenVPN installation and select openvpn-gui.exe. The main part is the arguments section, where we need to specify the config file to connect with. Here my config file name is amal.ovpn and it is located in the config directory of the OpenVPN installation. If we miss this argument, the auto connect will not work. A simple way to test the command is to execute it directly in the command line (PowerShell is recommended).

For example, go to the bin directory of OpenVPN (C:\Program Files\OpenVPN\bin) and open PowerShell there.

Execute the following command. The “amal.ovpn” can be replaced with your VPN config file name.

[Screenshot: openvpn_powershell_testing]

If the above command works, complete the action configuration and save the details.

[Screenshot: openvpn_action_info]

Note: The amal.ovpn is the vpn configuration file and is located in the OpenVPN config directory which defaults to “C:\Program Files\OpenVPN\config”

After configuring this, click OK and save the task. Then test the task by rebooting the system. I have configured this setup several times in several places and it worked perfectly.

Hope this article helped you 🙂 . If you are facing any issues, please comment on this post, I will be happy to help you.

What is Swap memory and How to clear Swap usage in Linux ?

What is Swap Space ?

Swap is space on disk that the system uses when the physical memory (RAM) is fully utilized. It increases the virtual memory available to the system. Because swap resides on disk, accessing it is much slower than accessing RAM.

Why do we need swap space ?

Suppose we have a system with 4GB RAM. When we start the system, the memory usage will be low. But as we open applications or start running processes, the memory utilization will increase. If it reaches 4GB without swap, we will not be able to start any additional applications until some RAM is freed. With swap, the space allocated on disk is used for the additional requirement, and the applications keep running even after crossing the limit of the system RAM. As already explained, swap is very slow compared to RAM.

How does memory management work internally ?

The Linux kernel's memory management tracks how memory pages are used and identifies the less frequently used ones. When memory demand exceeds the available RAM, the kernel moves these less frequently used pages to the disk space allocated for swap (“swapping” or paging). This frees up RAM for the applications that are actively running.

How to clear the swap memory usage?

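If you want to clear the swap memory, you can execute the following command in the terminal as the root user. Turning swap off moves everything currently in swap back into RAM, so there must be enough free RAM to hold it.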

swapoff -a && swapon -a

WARNING: Be careful doing this, as it may affect your system’s stability, especially if it is already low on RAM. It is better not to schedule swap clearing scripts as a cron job.
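
One way to reduce the risk is to check beforehand whether the RAM can absorb what is currently in swap. The following Python snippet is a rough sketch of such a check, not a guarantee of safety. The function names are hypothetical, and it assumes a kernel whose /proc/meminfo exposes the MemAvailable, SwapTotal and SwapFree fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def safe_to_clear_swap(meminfo):
    """True when available RAM exceeds the amount currently held in swap."""
    swap_used = meminfo["SwapTotal"] - meminfo["SwapFree"]
    return meminfo["MemAvailable"] > swap_used
```

On a Linux machine, the input would come from reading /proc/meminfo and passing its contents to parse_meminfo.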

Linux commands to check the disk utilization, size of directory or file

  • Command to check the disk utilization
df -h

The ‘-h’ option will provide the utilization in human readable format.

  • Command to check the size of a directory
du -sh <directory name>
  • Command to check the size of a file
du -sh <file name>
  • Command to check the size of files in a directory

Go inside the directory and execute the following command

du -sh *
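
When the same numbers are needed inside a program, the Python standard library offers rough equivalents. This is a minimal sketch with hypothetical function names. Note that it sums apparent file sizes, while du reports the disk blocks actually used, so the two can differ.

```python
import os
import shutil


def disk_usage_percent(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def directory_size_bytes(path):
    """Sum of apparent file sizes under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```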

How to maintain packages in a python project ?

In most cases, we need external packages for the development of a python program. These external packages are either available in the PyPI repository or available locally as archive files. Usually people just install the packages directly in the python environment using the pip command.

The pip command installs packages from the PyPI repository by default. If we do not specify a version, it selects the latest available version of the package that supports the python present in the environment (Python 2 or 3). Packages may change drastically between releases, so an application developed with version X of a package may not work with version Y of the same package. Noting down only the package names is therefore not enough to manage a project; we need the list of all packages along with their versions. Installing the packages manually one by one is also tedious, because a single project can depend on several tens of packages.

The best practices for managing packages in a project are

  1. Use python Virtual Environment.
  2. Create a requirements.txt to maintain the package details.

The details on how to create and use a virtual environment are explained in my previous post.

requirements.txt is a simple text file to maintain all the package dependencies with versions. A sample format is given below

Click==7.0
Flask==1.0.2
Flask-Cors==3.0.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.12.0
Werkzeug==0.14.1

All the packages listed in the file can be installed using a single command.

pip install -r requirements.txt

Packages in an environment can be captured in a requirements.txt file in one shot using the following command.

pip freeze > requirements.txt
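
To keep a requirements.txt honest, it helps to check that every entry is pinned to an exact version. The following Python snippet is a small sketch of such a check. The function names are my own, and it only understands the simple name==version format shown above, not the full requirements syntax that pip accepts.

```python
def parse_pinned(text):
    """Map package name (lowercased) to version for 'name==version' lines."""
    pinned = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip().lower()] = version.strip()
    return pinned


def unpinned_lines(text):
    """Return non-empty, non-comment lines that lack an exact '==' pin."""
    result = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line and "==" not in line:
            result.append(line)
    return result
```

An empty result from unpinned_lines suggests that every package in the file has an exact version.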

This practice helps developers manage the dependency list and makes code migration easier.

Virtual environment in Python

What is a Virtual Environment ?

A virtual environment is a tool that helps developers to segregate and maintain the dependencies required by different projects by creating isolated python virtual environments.

Need for Virtual Environment ?

Suppose User A and User B are working on two different projects.

The package requirements for user A are given below.

Click==7.0
Flask==1.0.2
Flask-Cors==3.0.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.12.0
Werkzeug==0.14.1

The package requirements for user B are given below.

Click==6.0
Flask==1.0.1
Flask-Cors==3.0.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.11.0
Werkzeug==0.11.2

As you can see, the two developers use different versions of the same packages. Python cannot keep multiple versions of the same package side by side in one site-packages directory. By default, packages are installed into the site-packages directory of the python installation. In Unix-like operating systems, this location is usually owned by the root user, so a normal user cannot install packages there without elevated privileges.

Virtual environment plays its role in the following scenarios

  • To isolate the dependencies required for different projects
  • To maintain the base python packages untouched. In case of multi-user environment, upgrading or modifying a package might disrupt the operation of one or more projects.
  • To enable easy access for the installation and management of python packages to the end users without enabling system level elevated privileges.
  • To easily manage the dependencies used in a specific project. We can copy the virtual environment to another system of the same version. Also it is easy to replicate the environment by dumping the package list (pip freeze).

How to create a virtual environment ?

For creating a virtual environment, we need the virtualenv package installed in the base python environment.

As root user or with elevated privilege, execute the following command

pip install virtualenv

Then create the virtual environment with the following command. The virtualenv command will be available only if the package was installed in the base python.

virtualenv <path for virtual environment>

You can specify any writable location as the path for the virtual environment. All the virtual environment related files and packages will be installed in this directory. It may take a few seconds to complete the virtual environment setup.
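
Environments can also be created from Python code. The sketch below uses the venv module from the Python 3 standard library rather than the virtualenv package described above, so treat it as an alternative and not the same tool. The function name create_environment is hypothetical.

```python
import os
import venv


def create_environment(path):
    """Create a virtual environment at `path` and return its pyvenv.cfg path."""
    # with_pip=False keeps this sketch quick and offline;
    # pass with_pip=True to have pip bootstrapped into the environment.
    venv.create(path, with_pip=False)
    return os.path.join(path, "pyvenv.cfg")
```

The pyvenv.cfg file written into the directory is what marks it as a virtual environment.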

Now type

which python

This will still point to the base python. To use the virtual environment, we need to activate it.

source <path of virtual environment>/bin/activate

The above command will activate the virtual environment in the current session. To enable it in all sessions by default, add this line to the .bashrc file.

Now type which python again and check the result. It will point to the newly created virtual environment.
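
The same check can be done from inside Python. This is a minimal sketch, assuming Python 3 and an environment in which sys.base_prefix points to the base installation, as it does for environments created by the standard venv module. The function name is my own.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    # Inside an environment, sys.prefix points at the environment directory
    # while sys.base_prefix still points at the base python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```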

For deactivating the environment, simply type deactivate in the command line.

How to check whether a Raspberry Pi is 32 bit or 64 bit ?

Recent Raspberry Pi models come with a 64 bit CPU, but earlier models had a 32 bit CPU. Some software depends on the CPU and OS architecture.

There are various options to check the architecture.

Method 1:

Type the following command and check the response.

uname -m

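You will get a response such as armv7l (32-bit) or aarch64 (64-bit). Note that uname reports the architecture of the running kernel, so a 64-bit capable board running a 32-bit OS will still report a 32-bit value.

ARMv7 and below are 32-bit. ARMv8 introduces the 64-bit instruction set.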
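
If a script needs to branch on the architecture, the same information is available from Python. The snippet below is a small sketch with hypothetical function names. It classifies only the machine strings I am sure about and returns None for anything else.

```python
import platform
import struct


def bits_for_machine(machine):
    """Classify a `uname -m` style string as 32 or 64 bit (None if unknown)."""
    name = machine.lower()
    if name in ("aarch64", "arm64"):
        return 64
    if name in ("armv6l", "armv7l"):
        return 32
    return None


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def current_machine_bits():
    """Apply the classification to the machine this code is running on."""
    return bits_for_machine(platform.machine())
```

interpreter_bits describes the Python build itself, which can be 32-bit even on a 64-bit kernel.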

Method 2:

Install lshw using the following command.

apt-get install lshw

Then run the lshw command. You will be able to find the architecture in its output.