Neelesh Parulkar asked a question

How can R and Hadoop be used together?


1 answer
VISHNU SUBRAMANIAN Deep learning researcher.
Hadoop is an ecosystem of many components. From R, the ones you are most likely to use are the SQL engines (Hive, Impala, Spark SQL) and the tools for data science work.

SQL: A typical scenario is pulling data from the underlying file system, which in Hadoop is HDFS. One simple way is to go through a Thrift server, which exposes the data much as any database would, so R can connect to it and run SQL queries.
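
As a rough sketch of the Thrift server route, the R code below connects over JDBC and runs a query. It assumes the DBI and RJDBC packages are installed, that a HiveServer2-compatible Thrift server is listening on port 10000, and that the Hive JDBC driver jar is available locally. The function name `query_hive`, the host name, the jar path, and the defaults are hypothetical placeholders, not something taken from a particular setup.

```r
library(DBI)

# Hypothetical helper: run one SQL statement against a HiveServer2-compatible
# Thrift server and return the result as an ordinary R data.frame.
query_hive <- function(sql,
                       host = "hive-host.example.com",  # assumed host name
                       port = 10000,                    # usual HiveServer2 port
                       database = "default",
                       user = "",
                       password = "",
                       driver_jar = "/path/to/hive-jdbc-standalone.jar") {  # assumed path
  # Load the Hive JDBC driver from the jar on the local machine.
  drv <- RJDBC::JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
                     classPath = driver_jar)
  url <- sprintf("jdbc:hive2://%s:%d/%s", host, port, database)

  conn <- dbConnect(drv, url, user, password)
  # Always close the connection, even if the query fails.
  on.exit(dbDisconnect(conn), add = TRUE)

  dbGetQuery(conn, sql)
}
```

With a reachable server, a call such as `query_hive("SELECT * FROM some_table LIMIT 10")` (the table name is made up) would return the rows as a data.frame that the rest of your R code can use directly.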

Data science tasks: You can use SparkR to build data pipelines: pulling data from multiple sources, cleaning it, and applying distributed ML.
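
A minimal sketch of such a pipeline is shown here, assuming the SparkR package and a local Spark installation are available. R's built-in `faithful` data set stands in for a real source so the example can run without a cluster, and the filter condition and the choice of a Gaussian GLM are purely illustrative.

```r
library(SparkR)

# Start a local Spark session; on a Hadoop cluster the master would
# typically be "yarn" instead of "local[*]".
sparkR.session(master = "local[*]", appName = "sparkr-pipeline-sketch")

# 1. Pull data. The built-in `faithful` data set is a stand-in here; on a
#    cluster this step could be read.df() pointed at a (hypothetical) HDFS path.
df <- createDataFrame(faithful)

# 2. Clean: drop rows with missing values and keep positive eruption lengths.
clean <- dropna(df)
clean <- filter(clean, clean$eruptions > 0)

# 3. Distributed ML: fit a Gaussian generalized linear model with Spark MLlib.
model <- spark.glm(clean, waiting ~ eruptions, family = "gaussian")

# 4. Score the data and bring the first few rows back into the local R session.
scored <- predict(model, clean)
preview <- head(select(scored, "eruptions", "waiting", "prediction"))
print(preview)

sparkR.session.stop()
```

The cleaning and model fitting run inside Spark, and only the small `preview` data.frame is copied back to R, which is the usual pattern when the full data set is too large for a single machine.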

Which of these tools is best depends on the problem you are trying to solve.

Vishnu Subramanian
