As part of our refactoring project, we'd like to find duplicated code. Our hand-rolled scripts do a decent job, but could use a lot of work. Rather than do a lot of work, I'm curious to know if anyone knows of any tools already out there for that.[snip]
Any suggestions? I'd be especially curious to hear about anything that operates at the opcode level and, as a result, can cope with renamed variables.
I don't know of anything Perl-specific, certainly not at the opcode level.
I've had some success throwing everything through perltidy to normalise the code then applying comparator <http://www.catb.org/~esr/comparator/>. This is all purely textual but works surprisingly well, and has the bonus of involving almost no actual work :-)
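For anyone who wants to see the shape of the "normalise, then compare text" idea without installing anything, here is a rough sketch in Python. It is not how perltidy or comparator actually work internally: `normalise` is a crude stand-in for perltidy (it only collapses whitespace and drops blank and comment-only lines), and `find_duplicates` is a hypothetical helper that hashes every run of `window` consecutive normalised lines and reports runs that show up more than once. The function names and the default window size are my own assumptions.

```python
import hashlib
from collections import defaultdict


def normalise(source):
    """Crude stand-in for perltidy: collapse whitespace, drop blank and
    comment-only lines. Returns (original_line_number, text) pairs."""
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = " ".join(line.split())
        if text and not text.startswith("#"):
            out.append((number, text))
    return out


def find_duplicates(files, window=3):
    """files maps a file name to its source text.

    Hashes every run of `window` consecutive normalised lines and returns
    a list of groups; each group lists the (file name, starting line)
    locations that share an identical run."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = normalise(source)
        for i in range(len(lines) - window + 1):
            chunk = "\n".join(text for _, text in lines[i:i + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, lines[i][0]))
    # Keep only the runs that occur in more than one place.
    return [locations for locations in seen.values() if len(locations) > 1]
```

Feeding it two files that differ only in indentation, blank lines and comments should report the shared runs with their original line numbers. Being purely textual, it will not see through renamed variables; that would need an extra normalisation pass over identifiers.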
You may want to take a look at CPD <http://pmd.sourceforge.net/cpd.html>, which does duplicate code detection for Java, C, C++, and PHP. Details on the algorithm at <http://dogma.net/markn/articles/bwt/bwt.htm>.
Cheers,
Adrian