Re: Keyword extraction from pdf to text

2010-11-30 Thread Ian Lea
If I've understood you correctly, you want to pump text into a Lucene Analyzer, grab the output, and do something else with it. If that is right, you can use code based on something like this: for (String s : array-of-input-texts) { Analyzer anl = new xxxAnalyzer(whatever)
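
To make the flow concrete, here is a minimal sketch of "turn text into tokens, keep the interesting ones, write them to a plain text file". It is a plain-Java stand-in and does not use Lucene: the class KeywordSketch, both method names, and the stop-word set are hypothetical, and the tokenizing is a simple lowercase-and-split. With Lucene you would presumably read the tokens from the TokenStream returned by Analyzer.tokenStream(fieldName, reader) instead of splitting by hand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: a stand-in for "run text through
// an analyzer and collect the tokens it emits".
final class KeywordSketch {

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop empty tokens and stop words. Order and duplicates are kept.
    static List<String> extractKeywords(String text, Set<String> stopWords) {
        List<String> keywords = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                keywords.add(token);
            }
        }
        return keywords;
    }

    // Write one keyword per line to a UTF-8 text file, replacing any
    // existing file at that path.
    static void writeKeywords(Path out, List<String> keywords) throws IOException {
        Files.write(out, keywords, StandardCharsets.UTF_8);
    }
}
```

You would call extractKeywords once per input text (the text having already been pulled out of the PDF by whatever extractor you use) and pass the combined list to writeKeywords. Nothing is indexed at any point, which matches the stated requirement.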

Keyword extraction from pdf to text

2010-11-30 Thread McGibbney, Lewis John
Hello list, I am currently attempting to extract keywords from PDF documents. My aim is then to begin constructing a domain ontology from the extracted words. I do not need to index anything at this stage, but wish to extract the keywords and push the output as plain text into a text file. An exa