Twister: Iterative MapReduce
http://www.iterativemapreduce.org/
tags: mapreduce
mrjob at master from Yelp's mrjob - GitHub
http://github.com/Yelp/mrjob/tree/master/mrjob/
tags: mapreduce yelp
Cassandra Project
Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems. Cassandra was open sourced by Facebook in 2008, where it was designed by one of the authors of Amazon's Dynamo. In a lot of ways you can think of Cassandra as Dynamo 2.0. Cassandra is in production use at Facebook but is still under heavy development.
http://incubator.apache.org/cassandra/
tags: amazon cassandra mapreduce distributed database
Gearman
Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates.
http://gearman.org/
tags: performance gearman mapreduce
mrtoolkit - Google Code
MRToolkit provides a framework for building simple Map/Reduce jobs in just a few lines of code. You provide only the map and reduce logic; the framework does the rest. Or use one of the provided map or reduce tools, and write even less. Map and reduce jobs are written in Ruby. It is similar to Google's Sawzall.
http://code.google.com/p/mrtoolkit/
tags: mapreduce
Collaborative Map-Reduce in the Browser - igvita.com
http://www.igvita.com/2009/03/03/collaborative-map-reduce-in...
tags: mapreduce javascript
Michael Nielsen » Write your first MapReduce program in 20 ...
(Credit to a nice blog post from Dave Spencer for the use of itertools.groupby to simplify the reduce phase.)
http://michaelnielsen.org/blog/?p=529
tags: mapreduce
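A minimal sketch of the itertools.groupby idea credited in the entry above, not the code from either blog post. The names map_words, reduce_counts, and map_reduce and the sample documents are hypothetical, chosen only for illustration. The intermediate pairs are sorted by key first, because groupby only groups adjacent items.

```python
from itertools import groupby
from operator import itemgetter


def map_words(doc_id, text):
    """Map phase: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce phase: sum all counts emitted for a single word."""
    return word, sum(counts)


def map_reduce(inputs, mapper, reducer):
    """Run mapper over every input, group the pairs by key, then reduce each group."""
    intermediate = []
    for key, value in inputs.items():
        intermediate.extend(mapper(key, value))
    # groupby only merges adjacent equal keys, so sort by key first.
    intermediate.sort(key=itemgetter(0))
    return [
        reducer(key, (value for _, value in group))
        for key, group in groupby(intermediate, key=itemgetter(0))
    ]


# Hypothetical sample input: file name -> contents.
docs = {"a.txt": "the quick brown fox", "b.txt": "the lazy dog the end"}
word_counts = dict(map_reduce(docs, map_words, reduce_counts))
```

Here word_counts maps each distinct word to its total count across both documents. The sort-then-groupby step stands in for the shuffle that a real MapReduce framework would do across machines.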
I’m Sorry Dave - Dave Spencer’s Weblog » Blog Archive ...
I’ve realized that I understand things best when I implement them myself, and I was recently reading Trevor Strohman’s dissertation, intrigued by TupleFlow, a kind of more elaborate and improved MapReduce, and was about to write my own toy impl of T
http://www.tropo.com/dave/blog/2008/07/09/mapreduce-in-10-or...
tags: python mapreduce
Tom White: "Disks have become tapes"
MapReduce is a programming model for processing vast amounts of data. One of the reasons that it works so well is because it exploits a sweet spot of modern disk drive technology trends. In essence MapReduce works by repeatedly sorting and merging data th
http://www.lexemetech.com/2008/03/disks-have-become-tapes.ht...
tags: mapreduce
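A rough sketch of the sort-and-merge pattern the excerpt above refers to, assuming the usual external-sort shape: sort small runs independently, then combine them in one sequential pass. The helper names sorted_runs and merge_runs and the sample records are hypothetical, and everything here stays in memory, whereas a real system would write the runs to disk.

```python
import heapq


def sorted_runs(records, run_size):
    """Split records into chunks of run_size and sort each chunk on its own."""
    for start in range(0, len(records), run_size):
        yield sorted(records[start:start + run_size])


def merge_runs(runs):
    """Merge already-sorted runs into one sorted list in a single streaming pass."""
    return list(heapq.merge(*runs))


# Hypothetical sample data standing in for records read from disk.
records = [5, 1, 4, 2, 8, 7, 3, 6]
runs = list(sorted_runs(records, run_size=3))
merged = merge_runs(runs)
```

The merge step reads each run front to back, which is the sequential access pattern that suits disks well.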