Hi,

Does anybody know of an open-source implementation of the Broder
algorithm <http://www.std.org/%7Emsm/common/clustering.html> in Hadoop?
Monika Henzinger reports having done so in MapReduce
<http://ltaa.epfl.ch/monika/mpapers/nearduplicates2006.pdf>, and I wonder
whether anybody has reproduced her work in open source.

If no implementation exists yet, I will write one and then ask what I can
do with the code.

Cheers,
Mark
