Wednesday, 23 September 2015

Create ISO Boot Image Using ISOLinux and mkisofs Utility on Linux Box

Dear Viewers

This post will help you to create an ISO boot image using ISOLinux and the mkisofs utility on a Linux box.

Prerequisite:
1. The Syslinux archive (ISOLinux is shipped as part of Syslinux).
2. The binary files, source code files, and other required configuration files of your preferred Linux flavor.
3. The mkisofs utility installed.
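A minimal sketch of the staging tree and the mkisofs invocation. The kernel and initrd names are placeholders, and isolinux.bin comes from the unpacked Syslinux archive; the mkisofs call is guarded so the tree can still be prepared where the tool is missing.

```shell
# Build a staging tree for the ISO. vmlinuz/initrd.img are placeholder names
# for your kernel and initial ramdisk from the chosen Linux flavor.
mkdir -p iso_root/isolinux iso_root/boot

# A minimal ISOLinux boot configuration
cat > iso_root/isolinux/isolinux.cfg <<'EOF'
DEFAULT linux
LABEL linux
  KERNEL /boot/vmlinuz
  APPEND initrd=/boot/initrd.img
EOF

# El Torito boot options: -b names the boot loader inside the tree, -c the
# boot catalog; -no-emul-boot, -boot-load-size 4 and -boot-info-table are
# required by ISOLinux. (Copy isolinux.bin from the Syslinux archive first.)
if command -v mkisofs >/dev/null 2>&1; then
  mkisofs -o bootable.iso \
    -b isolinux/isolinux.bin -c isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table \
    -J -R -V "MYLINUX" iso_root
fi
```

The resulting bootable.iso can be tested in a virtual machine before burning it to media.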


Tuesday, 22 September 2015

Make and CMake Utility Example in Linux


Dear Viewers

This post will help you to use the Make and CMake utilities in Linux to automatically compile and build a C language project. 

Prerequisite:
1. Basic knowledge of the C language. 
2. GCC compiler on your Linux box. 
3. Make and CMake utilities installed. (On the Linux terminal, type man make and man cmake to know more details.) 


Concept:
In your daily routine you might have come across a situation where a big software application needs to be split up into small modules. Each of these modules may contain thousands of lines of source code. Imagine that these modules have been developed by different developers who sit at different locations. In such a scenario, whenever a developer makes a change to the source code, they are expected to compile that code manually, and the same applies to every other developer. Manual compilation on every change quickly becomes a tedious job. Can we think of a solution to this problem? Yes, we have one: the Make and CMake utilities. Both are explained below with a practical demonstration. 
 

Monday, 21 September 2015

Hadoop MapReduce WordCount Program in Java

Dear Viewers,
 
This post will help you to run Hadoop MapReduce Word Count program in Java. 

Prerequisite:
1. Running Hadoop environment on your Linux box. (I have Hadoop 2.6.0)
2. Java installed on your Linux box. (I have Java 1.7.0)
3. External jar - hadoop-core-1.2.1.jar
4. Text input file  (I have Inputfile.txt)

Flow:
1. Prepare 3 Java source code files, namely WordCount.java, WordMapper.java, and WordReducer.java.
2. WordCount.java is the main class; you may also refer to it as the Driver class. From its source 
code, it refers to WordMapper.class and WordReducer.class.
3. WordMapper.java splits up the user input and generates <key,value> pairs as output. 
That is <word, and its count>.
4. WordReducer.java accepts the Mapper's output as its input. It combines the output provided 
by WordMapper.class and generates the final output, which is also a <key,value> pair. The final 
output indicates how many times each word has occurred. 
5. Compile the source code files, making use of the external jar file. 
6. After successful compilation, create a jar file by putting together all the .class files.
7. Run your program. The syntax to be followed while running this program is as below. 
$ hadoop jar jar-file-name Driver/main-class-name Input-file-name-on-hdfs 
Output-file-directory-on-hdfs
8. The final output file on HDFS, generated by the Reducer class, will have the name “part-r-00000” 
(a map-only job would instead produce files named “part-m-00000”). So open the file part-r-00000 
from the terminal to see the final output. 
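The compile, package, and run steps above can be collected into one small script. The file names follow this post, but the HDFS paths are placeholders and a running Hadoop cluster is required, so the script is only written out here, not executed.

```shell
# The build-and-run sequence from the flow above, saved as a script.
# "wordcount-out" is a placeholder HDFS output directory.
cat > build_and_run.sh <<'EOF'
#!/bin/sh
# 5. Compile the three sources against the external jar from the prerequisites
javac -classpath hadoop-core-1.2.1.jar WordCount.java WordMapper.java WordReducer.java
# 6. Bundle all generated .class files into one jar
jar cvf wordcount.jar *.class
# 7. hadoop jar <jar-file-name> <Driver class> <HDFS input> <HDFS output dir>
hadoop jar wordcount.jar WordCount Inputfile.txt wordcount-out
# 8. Inspect the Reducer's final counts
hadoop fs -cat wordcount-out/part-r-00000
EOF
chmod +x build_and_run.sh
```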

Sunday, 20 September 2015

Hadoop MapReduce WordCount Program in Python

Dear Viewers,
This post will help you to run Hadoop MapReduce Word Count program in Python.



Prerequisite:
1. Running Hadoop environment on your Linux box. (I have Hadoop 2.6.0)
2. Python installed on your Linux box. (I have Python 2.7.3)
3. External jar - hadoop-streaming-2.6.0.jar
4. Text input file (I have Employee.txt)



Flow:
  1. Prepare 2 Python source code files, namely Mapper.py and Reducer.py.
  2. Mapper.py splits up the user input and generates <key,value> pairs as output. That is <word, and its count>.
  3. Reducer.py accepts the Mapper's output as its input. It combines the output provided by Mapper.py and generates the final output, which is also a <key,value> pair. The final output indicates how many times each word has occurred.
  4. To run these files from the Linux terminal, change their permissions to executable. To do so, you may use the command  $ chmod +x *.py
  5. The syntax to be followed while running this program is as below. '\' on the terminal indicates line continuation.
    $ hadoop jar hadoop-streaming-2.6.0.jar \
    -file filename -mapper mapperfile \
    -file filename -reducer reducerfile \
    -input inputfilename \
    -output outputfiledirectory
  6. The output file on HDFS will have the name “part-00000”. So open this file from the terminal to see the final output.
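A minimal sketch of the two streaming scripts (the word-count logic below is an illustration, not the post's exact code; it runs under both Python 2 and 3, though python3 is used here), followed by a local dry run in which sort stands in for Hadoop's shuffle phase between Mapper and Reducer.

```shell
# Mapper: emit "<word><TAB>1" for every word on stdin
cat > Mapper.py <<'EOF'
#!/usr/bin/env python
import sys
for line in sys.stdin:
    for word in line.strip().split():
        print("%s\t%s" % (word, 1))
EOF

# Reducer: input arrives sorted by word, so counts for one word are adjacent
# and can be summed in a single pass
cat > Reducer.py <<'EOF'
#!/usr/bin/env python
import sys
current, count = None, 0
for line in sys.stdin:
    word, n = line.strip().rsplit("\t", 1)
    if word == current:
        count += int(n)
    else:
        if current is not None:
            print("%s\t%s" % (current, count))
        current, count = word, int(n)
if current is not None:
    print("%s\t%s" % (current, count))
EOF
chmod +x Mapper.py Reducer.py

# Local simulation of the streaming pipeline: map | shuffle (sort) | reduce
echo "big data big wins" | python3 Mapper.py | sort | python3 Reducer.py
# prints: big 2 / data 1 / wins 1 (tab-separated, one pair per line)
```

This local pipe is a handy way to debug the scripts before submitting them through hadoop-streaming-2.6.0.jar with the syntax shown in step 5.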