Hi,
I have a simple MapReduce algorithm that I would like to implement. I am currently searching for a framework and came across Skynet. I am not yet familiar with Ruby, but I was generally impressed with what I saw.
My map reduce algorithm works similarly to the following:
I have one text file that contains lines of text. I want to assign one map function to each line; that map function will process the line as it sees fit and retain the line's position (e.g. whether it is line 1, line 2, etc.). What gets passed to the reduce function is an ordered pair containing some processed information and the line number.
For example, if I were locating words, and I had two lines:
Hello World
Goodbye World
There is a mapper assigned to line 1 and another assigned to line 2.
The first mapper emits:
<"Hello", 1> <"World", 1>
And the second mapper emits:
<"Goodbye", 2> <"World", 2>
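To make the idea concrete, here is a minimal sketch in plain Python (not Skynet, Hadoop, or Erlang code). The names `map_line` and `reduce_pairs` are made up for illustration, and the reduce step, which groups line numbers by word, is only an assumption about what the reducer might do with the pairs.

```python
from collections import defaultdict

def map_line(line_number, line):
    """Hypothetical mapper: emit one (word, line_number) pair per word."""
    return [(word, line_number) for word in line.split()]

def reduce_pairs(pairs):
    """Hypothetical reducer (an assumption): collect line numbers per word."""
    index = defaultdict(list)
    for word, line_number in pairs:
        index[word].append(line_number)
    return dict(index)

# In practice the lines would be read lazily from the large file,
# e.g. `for n, line in enumerate(open(path), start=1)`.
lines = ["Hello World", "Goodbye World"]

pairs = []
for n, line in enumerate(lines, start=1):
    pairs.extend(map_line(n, line))

index = reduce_pairs(pairs)
```

Each call to `map_line` only needs its own line and line number, so the calls could be distributed across workers by whichever framework ends up being used.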
Please keep in mind that this file is large enough that it does not make sense to store each line in its own file.
Given this information, is Skynet right for me? I would appreciate your thoughts, and any suggestions you may have for other platforms to look at. I am currently also looking at Hadoop and Erlang.
Thank you very much for your time. I look forward to your response.
-SM