On Jul 15, 2005, at 3:12 PM, [EMAIL PROTECTED] wrote:
If Microsoft Search does as you describe, isn't it just:
1) Open file
2) Determine file type
3) Convert file content to UTF-8, if it is text based and you have the
API to read it (.html, .txt, .doc, .xls, etc.)
4) Perform string search, rege
As somebody already said, you can have an in-memory index with
RAMDirectory. You can also pre-build a Lucene index on that CD - CD is
"static", you can't add/remove/change files on it, so you can build an
index and burn it onto the CD at the same time when you put the Word
files on it.
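
To illustrate the idea of building an index once and then querying it from memory, here is a minimal sketch in plain Python. It is not Lucene and does not use RAMDirectory; it is only a toy inverted index, and the file names and contents in `docs` are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical contents standing in for text extracted from files on the CD.
docs = {
    "report.doc": "Quarterly sales report for the northern region",
    "notes.txt": "Meeting notes about the sales forecast",
    "index.html": "Welcome page with links to every report",
}
index = build_index(docs)
```

Because the files on the CD never change, the index only has to be built once; every later call to `search(index, ...)` is a few dictionary lookups and never touches the files again.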
As for get
I imagine you could index the info you wanted to quickly search on into a
RAMDirectory (assuming it wasn't too much info), then run simple or complex
searches on that, but that might take longer to do than simple regex
searching on files. That would only give you a gain if you were going to run
r
If Microsoft Search does as you describe, isn't it just:
1) Open file
2) Determine file type
3) Convert file content to UTF-8, if it is text based and you have the API to
read it (.html, .txt, .doc, .xls, etc.)
4) Perform string search, regex.
4) Perform string search, regex.
5) Continue to next file
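
The five steps above could look roughly like this in Python. This is only a sketch of the brute-force approach, not a description of how Microsoft Search actually works. The names `TEXT_SUFFIXES`, `read_as_text` and `scan` are invented for the example, and it assumes only .txt and .html files can be read as text; formats like .doc would need a real converter, which is left out.

```python
import re
from pathlib import Path

# Assumption: only these extensions are treated as readable text.
TEXT_SUFFIXES = {".txt", ".html"}


def read_as_text(path):
    """Steps 1-3: open the file, check its type by extension, decode as UTF-8.

    Returns None when the file type is not supported.
    """
    if path.suffix.lower() not in TEXT_SUFFIXES:
        return None
    # Undecodable bytes are replaced rather than raising an error.
    return path.read_bytes().decode("utf-8", errors="replace")


def scan(root, pattern):
    """Steps 4-5: regex-search every supported file under root, one by one."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        text = read_as_text(path)
        if text is None:
            continue  # unsupported type, move on to the next file
        if regex.search(text):
            hits.append(path.name)
    return hits
```

Note that every call to `scan` reads every file again, so the cost of each query grows with the total amount of data. That is fine for a one-off search, but it is the main difference from searching an index that was built ahead of time.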
As far as I know, Lucene is n