Hi All: I just found a data set that I would like to integrate with [codec] to test the language package:
http://sourceforge.net/projects/familynamephon/

The test data file contains 837K German names (37MB) in a text file and encodings in Cham (?) phonetics, Cologne phonetics, Metaphone, and Soundex.

I have no idea how long it would take to run a test for our language encoders on this, but I imagine making it an optional unit test. How do you do THAT in Maven?

The data is covered (I think; I do not read German) by this license: http://www.opendatacommons.org/licenses/odbl/1.0/

Thoughts?

Gary Gregory
Senior Software Engineer
Rocket Software
3340 Peachtree Road, Suite 820 * Atlanta, GA 30326 * USA
Tel: +1.404.760.1560
Email: ggreg...@seagullsoftware.com
Web: http://www.seagull.rocketsoftware.com/
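
P.S. On the optional-test question, one possibility (only a rough sketch, not something [codec] does today) would be to gate the big test on a system property plus the presence of the data file. The class name, the property name codec.largeDataTest, and the idea of passing the file in are all made up here for illustration:

```java
import java.io.File;

/**
 * Hypothetical helper that decides whether the large German-names test
 * should run. The class name and property name are assumptions for
 * illustration, not existing [codec] code.
 */
public class OptionalDataTestGate {

    /** Assumed opt-in property, e.g. mvn test -Dcodec.largeDataTest=true */
    public static final String ENABLE_PROPERTY = "codec.largeDataTest";

    /**
     * Returns true only when the opt-in property is "true" and the
     * data file exists as a readable regular file.
     */
    public static boolean shouldRun(File dataFile) {
        // Boolean.getBoolean reads the named system property and parses it.
        boolean enabled = Boolean.getBoolean(ENABLE_PROPERTY);
        return enabled
                && dataFile != null
                && dataFile.isFile()
                && dataFile.canRead();
    }
}
```

A JUnit 4 test could then start with Assume.assumeTrue(OptionalDataTestGate.shouldRun(file)), so the test is skipped instead of failing when the property is not set or the 37MB file has not been downloaded. If the property given on the mvn command line does not reach the test JVM, Surefire's systemPropertyVariables configuration should be able to pass it along.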