C/C++ MapReduce Code & build
This is the WordCount example written in C/C++ using the Hadoop Pipes API.
#include "hadoop/Pipes.hh" #include "hadoop/TemplateFactory.hh" #include "hadoop/StringUtils.hh" class WordCountMap: public HadoopPipes::Mapper { public: WordCountMap(HadoopPipes::TaskContext& context){} void map(HadoopPipes::MapContext& context) { std::vector<std::string> words = HadoopUtils::splitString(context.getInputValue(), " "); for(unsigned int i=0; i < words.size(); ++i) { context.emit(words[i], "1"); } } }; class WordCountReduce: public HadoopPipes::Reducer { public: WordCountReduce(HadoopPipes::TaskContext& context){} void reduce(HadoopPipes::ReduceContext& context) { int sum = 0; while (context.nextValue()) { sum += HadoopUtils::toInt(context.getInputValue()); } context.emit(context.getInputKey(), HadoopUtils::toString(sum)); } }; int main(int argc, char *argv[]) { return HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>()); }
To compile the example, build Hadoop with C++ support enabled; this also builds the C/C++ word count example:
# ant -Dcompile.c++=yes examples
Upload C++ binary files to HDFS
To upload the binary files to HDFS, the command syntax is:
# bin/hadoop fs -put build/c++-examples/Linux-i386-32/bin /examples/bin
Set the MapReduce Config
# vi src/examples/pipes/conf/word.xml

<?xml version="1.0"?>
<configuration>
  <!-- Set the executable's path on HDFS -->
  <property>
    <name>hadoop.pipes.executable</name>
    <value>/examples/bin/wordcount</value>
  </property>
  <property>
    <name>hadoop.pipes.java.recordreader</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.pipes.java.recordwriter</name>
    <value>true</value>
  </property>
</configuration>
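As an alternative to a separate XML file, the pipes launcher in Hadoop releases of this vintage also accepts `-jobconf` and `-program` options, so the same settings can be supplied on the command line. A sketch, assuming the binary was uploaded to `/examples/bin/wordcount` as above; exact option support depends on your Hadoop version:

```shell
# Equivalent run without word.xml: pass the pipes properties inline.
bin/hadoop pipes \
  -jobconf hadoop.pipes.java.recordreader=true,hadoop.pipes.java.recordwriter=true \
  -program /examples/bin/wordcount \
  -input in-dir -output out-dir
```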
Execute
To run the example, the command syntax is:
# bin/hadoop pipes -conf src/examples/pipes/conf/word.xml -input in-dir -output out-dir