Monday, March 14, 2016

Configuring Jaql in your hadoop cluster

Dear Technocrats,

Jaql is a scripting language designed specifically for querying JSON data, the format commonly produced by web services, e.g. in Twitter analysis.

In this post I show the procedure to configure Jaql on your Hadoop cluster.
It is again a very simple procedure to follow.

Just download the Jaql tar file, extract it anywhere on your system, and copy its path. Set this path as JAQL_HOME in your .bashrc file.

Then go to the bin directory and run ./jaqlshell. You will be dropped into the Jaql shell prompt for Jaql scripting.
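The launch sequence looks like this, assuming JAQL_HOME has already been set in your .bashrc:

```
$ cd $JAQL_HOME/bin
$ ./jaqlshell
jaql>
```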



Note: to add the path in your .bashrc file, open it with gedit or vim:
                  $ gedit ~/.bashrc         or       $ vim ~/.bashrc
You will see the .bashrc file; just go to the end of the file and make the entry for your Jaql home as shown below:
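The screenshot is missing here, so this is a minimal sketch of the entry; the path /usr/local/jaql is an assumption for where you extracted the tar file, so adjust it to your own location:

```shell
# Illustrative ~/.bashrc entries; /usr/local/jaql is an assumed extract location
export JAQL_HOME=/usr/local/jaql
export PATH=$PATH:$JAQL_HOME/bin
```

After saving, run `source ~/.bashrc` (or open a new terminal) so the new variables take effect.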


Save the file and start ./jaqlshell to do your analysis with Jaql. Jaql fits best for analyzing JSON files, as in Twitter trend analysis.

Enjoy Jaql scripting.

To get more frequent updates, like our page, or go to the home page of this blog.

Friday, March 11, 2016

Configuring hbase on hadoop cluster

Dear Technocrats,


Greetings...

In this post I show the procedure to install and use the HBase shell on your Hadoop cluster.

First download a stable HBase release from the Apache site. Unzip the tar file and go into the extracted directory. Make the necessary changes in the conf/hbase-site.xml file as shown.
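The screenshot is not reproduced here; a minimal illustrative conf/hbase-site.xml follows, where the HDFS URL and the ZooKeeper data directory are assumptions you should adjust to your own cluster:

```xml
<!-- conf/hbase-site.xml: illustrative values only; adjust for your cluster -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper</value>
  </property>
</configuration>
```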


Also set JAVA_HOME=&lt;path to your Java&gt; in conf/hbase-env.sh, then save and exit.
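For example, the entry in conf/hbase-env.sh could look like the line below; the JDK path shown is an assumption, so point it at your own Java install:

```shell
# conf/hbase-env.sh -- the JDK path below is illustrative; use your own Java location
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```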

Then run start-hbase.sh from the bin directory as shown in the screenshot...


Now HBase is running on your Hadoop cluster as shown above. But HBase does not direct you to the HBase shell by default.

To use the HBase shell, run the command ./bin/hbase shell from $HBASE_HOME. You can then program in the HBase shell and, when finished, leave it by typing quit at the prompt. The complete procedure is as shown:
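The screenshots are not reproduced here; a sketch of the full sequence follows, where the table name 'test' and column family 'cf' are illustrative:

```
$ ./bin/start-hbase.sh
$ ./bin/hbase shell
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> quit
```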



Enjoy HBase scripting. Wish you all the best... (y)

Thursday, March 10, 2016

Pig installation on hadoop cluster

Dear Technocrats,


In this post, I explain the procedure for installing Pig on your Hadoop cluster (we have a Hadoop 2.6.0 cluster configured in our lab).

Pig configuration:
1. Download pig-0.x.x and extract it.
2. Set the Java path in your .bashrc file.
3. Go to the pig-0.x.x/bin directory.
4. Run the command $ ./pig -x local
5. It will bring you to the grunt shell as shown in the screen-shot below:
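The screenshot is missing; the launch looks like this (the version number in the path is a placeholder for your actual release):

```
$ cd pig-0.x.x/bin
$ ./pig -x local
grunt>
```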



grunt is the Pig shell. So enjoy scripting with Pig on your cluster.

(Note: Pig needs Java 7 or higher, so if you are using any lower version, switch to a higher one.)

Working on grunt shell:

Suppose you have two files, each having two columns, and you wish to join those tables on one common column. It is as simple as four lines of code in the Pig terminal, as shown in the program screenshot below:
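The screenshot is not reproduced here; a sketch of such a join follows, assuming two tab-separated files a.txt and b.txt whose first column is the shared key (all names and schemas here are illustrative):

```pig
A = LOAD 'a.txt' USING PigStorage('\t') AS (id:int, val1:chararray);
B = LOAD 'b.txt' USING PigStorage('\t') AS (id:int, val2:chararray);
J = JOIN A BY id, B BY id;
DUMP J;
```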


Analysis of Sample healthcare dataset:

This is an analysis of a sample healthcare dataset with entries for patient name, disease, gender, age, etc. Here is the procedure to find the number of male patients suffering from swine flu.
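The screenshot is missing; a sketch of the count follows, assuming a comma-separated file healthcare.csv with columns in the order shown (the file name, column order, and field values are assumptions about the sample data):

```pig
records = LOAD 'healthcare.csv' USING PigStorage(',')
          AS (name:chararray, disease:chararray, gender:chararray, age:int);
males = FILTER records BY gender == 'male' AND disease == 'Swine Flu';
grouped = GROUP males ALL;
counts = FOREACH grouped GENERATE COUNT(males);
DUMP counts;
```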


