HDFS Operations Using Java Program

We are familiar with Hadoop Distributed File System (HDFS) operations such as copyFromLocal, copyToLocal, mv, cp and rmr.
Here I explain how to perform these operations using the Java API. For now, I cover only the programs for copyFromLocal and copyToLocal.

For programming I used the Eclipse IDE, which is installed on my Windows desktop machine.
I have a Hadoop cluster, and the cluster machines and my desktop machine are on the same network.

First create a Java project and, inside it, create a folder named conf. Copy the Hadoop configuration files (core-site.xml, mapred-site.xml, hdfs-site.xml) from your Hadoop installation to this conf folder.
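
As far as I know, Configuration.addResource does not complain when a file it is given does not exist; the resource is simply skipped, so a misspelled or forgotten file can go unnoticed until the program talks to the wrong file system. A small check written with plain JDK classes can catch this before any Hadoop code runs. This is only a sketch: the class name ConfCheck and the method missingConfFiles are made up for illustration, and the conf folder is assumed to sit in the project's working directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
public class ConfCheck {

    // The three configuration files named in the text above.
    static final String[] REQUIRED = { "core-site.xml", "mapred-site.xml", "hdfs-site.xml" };

    // Returns the names of the required files that are not present in confDir.
    static List<String> missingConfFiles(Path confDir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(confDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the program is started from the project root.
        List<String> missing = missingConfFiles(Paths.get("conf"));
        if (missing.isEmpty()) {
            System.out.println("All configuration files found.");
        } else {
            System.out.println("Missing from conf folder: " + missing);
        }
    }
}
```

Running this once from the project root before the copy programs gives a quick confirmation that all three files are where the code expects them.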

Create another folder named source, which we use as the input location, and put a text file inside it.
One thing to remember is that the source and destination locations must have appropriate permissions; otherwise the read or write will be blocked.
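
The local half of that permission requirement can be checked from Java before any copy is attempted. The sketch below uses only java.nio.file; the class name LocalAccessCheck and both method names are invented for this example. It says nothing about permissions on the HDFS side, which are enforced by the cluster for the user running the program.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that checks local paths only.
public class LocalAccessCheck {

    // True if the local source exists and the current user may read it.
    static boolean canReadSource(Path source) {
        return Files.exists(source) && Files.isReadable(source);
    }

    // True if the local destination can be written: either it already exists
    // and is writable, or its parent folder exists and is writable.
    static boolean canWriteDestination(Path dest) {
        if (Files.exists(dest)) {
            return Files.isWritable(dest);
        }
        Path parent = dest.toAbsolutePath().getParent();
        return parent != null && Files.isDirectory(parent) && Files.isWritable(parent);
    }

    public static void main(String[] args) {
        // "source" and "destination" are the folder names used in this post.
        System.out.println("source readable: " + canReadSource(Paths.get("source")));
        System.out.println("destination writable: " + canWriteDestination(Paths.get("destination")));
    }
}
```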

Copying a File from Local to HDFS

The equivalent shell command is
hadoop fs -copyFromLocal <localsrc> <dst>

package com.amal.hadoop;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * @author amalgjose
 *
 */
public class CopyFromLocal {

	public static void main(String[] args) throws IOException {
		
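		// Load the cluster settings copied into the project's conf folder
		Configuration conf = new Configuration();
		conf.addResource(new Path("conf/core-site.xml"));
		conf.addResource(new Path("conf/mapred-site.xml"));
		conf.addResource(new Path("conf/hdfs-site.xml"));
		// Connect to the file system named in the configuration (HDFS here)
		FileSystem fs = FileSystem.get(conf);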
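		// Local folder to upload and the HDFS directory that receives it
		Path sourcePath = new Path("source");
		Path destPath = new Path("/user/training");
		// Stop early if the destination directory is missing in HDFS
		if (!fs.exists(destPath)) {
			System.out.println("No such destination exists: " + destPath);
			return;
		}

		// Copy from the local file system into HDFS
		fs.copyFromLocalFile(sourcePath, destPath);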
		
	}
}

Copying a File from HDFS to Local

The equivalent shell command is
hadoop fs -copyToLocal <src> <localdst>

package com.amal.hadoop;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
/**
 * @author amalgjose
 *
 */
public class CopyToLocal {
	public static void main(String[] args) throws IOException {

		// Load the cluster settings copied into the project's conf folder
		Configuration conf = new Configuration();
		conf.addResource(new Path("conf/core-site.xml"));
		conf.addResource(new Path("conf/mapred-site.xml"));
		conf.addResource(new Path("conf/hdfs-site.xml"));
		// Connect to the file system named in the configuration (HDFS here)
		FileSystem fs = FileSystem.get(conf);
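		// HDFS path to download and the local folder that receives it
		Path sourcePath = new Path("/user/training");
		Path destPath = new Path("destination");
		// Stop early if the source path is missing in HDFS
		if (!fs.exists(sourcePath)) {
			System.out.println("No such source exists: " + sourcePath);
			return;
		}

		// Copy from HDFS to the local file system
		fs.copyToLocalFile(sourcePath, destPath);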
		
	}
}

Simple Tag Cloud Generation Using Java Program

A tag cloud is a visual representation of text data. The tags are words, and the importance of each word is highlighted using colour or font size. Tag clouds are a popular way to analyse the contents of websites because they help the reader quickly perceive the most important words. Importance is calculated by counting the number of occurrences of each word, and each word (tag) is given a weight based on that count. After the whole text has been analysed, each tag is displayed according to its weight, and this produces the tag cloud.

Open Cloud is a Java library for generating tag clouds, and I used it here. Normally we need a web server to get a good UI for the tag cloud, but here we display the cloud using Swing. This is a sample program that generates a simple tag cloud. To run it, download the Open Cloud library.

package tagcloud;

import java.util.Random;

import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

import org.mcavallo.opencloud.Cloud;
import org.mcavallo.opencloud.Tag;

public class TestOpenCloud {

	private static final String[] WORDS = { "amal", "india", "hello", "amal", "birthday", "amal", "hello", "california", "america", "software",
			"cat", "bike", "car", "christmas", "city", "zoo", "amal", "asia", "family", "festival", "flower", "flowers", "food",
			"little", "friends", "fun", "amal", "outing", "india", "weekend", "india", "software", "me", "music", "music", "music",
			"new", "love", "night", "nikon", "morning", "love", "park", "software", "people", "portrait", "flower", "sky", "travelling",
			"spain", "summer", "sunset", "india", "city", "india", "amal", "uk", "usa", "", "water", "wedding", "cool", "happy", "friends", "best", "trust", "good",
			"enjoy", "cry", "laugh" };

	protected void initUI() {
		JFrame frame = new JFrame(TestOpenCloud.class.getSimpleName());
		frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
		JPanel panel = new JPanel();
		Cloud cloud = new Cloud();
		Random random = new Random();
		// Add each word a random number of times (0 to 49) so the tags get different weights
		for (String s : WORDS) {
			for (int i = random.nextInt(50); i > 0; i--) {
				cloud.addTag(s);
			}
		}
		// Show every tag as a label whose font size grows with its weight
		for (Tag tag : cloud.tags()) {
			final JLabel label = new JLabel(tag.getName());
			label.setOpaque(false);
			label.setFont(label.getFont().deriveFont((float) tag.getWeight() * 10));
			panel.add(label);
		}
		frame.add(panel);
		frame.setSize(800, 600);
		frame.setVisible(true);
	}

	public static void main(String[] args) {
		SwingUtilities.invokeLater(new Runnable() {
			@Override
			public void run() {
				new TestOpenCloud().initUI();
			}
		});
	}

}
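
The idea of weighting by occurrence can also be sketched without the library. The example below counts how often each word appears and maps the count linearly onto a font size range. It is not how Open Cloud computes weights internally; the class name TagWeights, the method names and the linear scaling are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone illustration of occurrence-based tag weighting.
public class TagWeights {

    // Counts how many times each non-empty word occurs, keeping first-seen order.
    static Map<String, Integer> countOccurrences(String[] words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // an empty string is not a useful tag
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Maps a count linearly onto [minSize, maxSize]: the rarest word gets
    // minSize and the most frequent word gets maxSize.
    static float fontSize(int count, int minCount, int maxCount, float minSize, float maxSize) {
        if (maxCount == minCount) {
            return minSize; // all words equally frequent, so no spread to show
        }
        float ratio = (float) (count - minCount) / (maxCount - minCount);
        return minSize + ratio * (maxSize - minSize);
    }

    public static void main(String[] args) {
        String[] words = { "amal", "india", "amal", "music", "amal", "india" };
        Map<String, Integer> counts = countOccurrences(words);
        int min = Collections.min(counts.values());
        int max = Collections.max(counts.values());
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            float size = fontSize(e.getValue(), min, max, 10f, 40f);
            System.out.println(e.getKey() + ": count " + e.getValue() + ", font size " + size);
        }
    }
}
```

A size computed this way could be passed to deriveFont in the same way the Swing program above uses the tag weight.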