R and Big Data

R programming is getting more attention these days. One reason, as far as I can tell, is that it can be used effectively for big data analytics. R is a good statistical tool, and it applies well to large-scale analysis. More and more, systems learn from data, or, put another way, we teach systems using data. Advanced analytics in R makes it easier to generate insights from large data sets. Many packages are now available that let R work on top of the latest big data technologies. Some of the libraries I have noticed are listed below.

1) Rhipe: RHIPE (hree-pay’) is the R and Hadoop Integrated Programming Environment.
For more details, see Rhipe.

2) Rhive : RHive is an R extension facilitating distributed computing via Apache Hive.
For more details, see Rhive.

3) Rhbase : This R package provides basic connectivity to HBASE, using the Thrift server. R programmers can browse, read, write, and modify tables stored in HBASE.
For more details, see Rhbase.

4) Rhdfs : This R package provides basic connectivity to the Hadoop Distributed File System. R programmers can browse, read, write, and modify files stored in HDFS.
For more details, see Rhdfs.

5) Rmr : This R package allows an R programmer to perform statistical analysis via MapReduce on a Hadoop cluster.
For more details, see Rmr.

6) Plyrmr : This R package enables the R user to perform common data manipulation operations, as found in popular packages such as plyr and reshape2, on very large data sets stored on Hadoop. Like rmr, it relies on Hadoop MapReduce to perform its tasks, but it provides a familiar plyr-like interface while hiding many of the MapReduce details.
For more details, see Plyrmr.

7) Rmongo : MongoDB Database interface for R. The interface is provided via Java calls to the mongo-java-driver.
For more details, see Rmongo.
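
Several of the packages above (Rmr and Plyrmr in particular) are built around the MapReduce pattern, so here is a rough sketch of that idea in plain base R. It does not use Hadoop or any of the packages listed, and it is not their API. The names `map_fn` and `reduce_fn` and the sample input are my own, made up for illustration. On a real cluster the map and reduce steps would run in parallel across many machines.

```r
# Conceptual sketch in base R only: no Hadoop, none of the packages above.
# map_fn, reduce_fn and the sample input are hypothetical, for illustration.

lines <- c("r and hadoop", "r and hive", "hadoop and hbase")

# Map step: emit one (word, 1) pair for every word in a line.
map_fn <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  data.frame(key = words, value = 1L, stringsAsFactors = FALSE)
}

# Reduce step: combine all values that share the same key.
reduce_fn <- function(values) sum(values)

# Apply the map step to every input record and collect the pairs.
pairs <- do.call(rbind, lapply(lines, map_fn))

# Shuffle: group the values by key, then reduce each group.
grouped <- split(pairs$value, pairs$key)
counts <- vapply(grouped, reduce_fn, integer(1))

print(counts)
```

The result is a named integer vector of word counts. The same three stages (map, group by key, reduce) are what a MapReduce job on Hadoop carries out, only distributed over the cluster instead of inside a single R session.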
